All materials for this workshop are available in my standard GitHub repo:
https://github.com/kaybenleroll/dublin_r_workshops
The content of this workshop is based on the book “Statistical Analysis of Network Data with R” by Kolaczyk and Csardi. The data from this book is available from CRAN via the package sand, and there is also a GitHub repo for the code in the book:
https://github.com/kolaczyk/sand
Additional ideas and concepts were taken from the Coursera course “Social and Economic Networks” taught by Matthew O. Jackson:
https://www.coursera.org/learn/social-economic-networks
In this workshop we are going to use three different networks as reference datasets to illustrate the concepts we discuss.
data(flo, package = 'network')
florence_igraph <- graph_from_adjacency_matrix(flo, mode = 'undirected')
plot(florence_igraph)
We can use the package ggnetwork to plot networks within ggplot2:
florence_layout <- ggnetwork(florence_igraph, layout = 'fruchtermanreingold')
ggplot(florence_layout, aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges() +
geom_label(aes(label = vertex.names), size = 5) +
theme_blank()
### Show graph vertices
V(florence_igraph)
## + 16/16 vertices, named, from 6cd687c:
## [1] Acciaiuoli Albizzi Barbadori Bischeri Castellani
## [6] Ginori Guadagni Lamberteschi Medici Pazzi
## [11] Peruzzi Pucci Ridolfi Salviati Strozzi
## [16] Tornabuoni
### Show graph edges
E(florence_igraph)
## + 20/20 edges from 6cd687c (vertex names):
## [1] Acciaiuoli--Medici Albizzi --Ginori Albizzi --Guadagni
## [4] Albizzi --Medici Barbadori --Castellani Barbadori --Medici
## [7] Bischeri --Guadagni Bischeri --Peruzzi Bischeri --Strozzi
## [10] Castellani--Peruzzi Castellani--Strozzi Guadagni --Lamberteschi
## [13] Guadagni --Tornabuoni Medici --Ridolfi Medici --Salviati
## [16] Medici --Tornabuoni Pazzi --Salviati Peruzzi --Strozzi
## [19] Ridolfi --Strozzi Ridolfi --Tornabuoni
To help access the edgelist in a more usable form we convert the edgelist to a matrix, showing the origin and destination nodes.
as_edgelist(florence_igraph)
## [,1] [,2]
## [1,] "Acciaiuoli" "Medici"
## [2,] "Albizzi" "Ginori"
## [3,] "Albizzi" "Guadagni"
## [4,] "Albizzi" "Medici"
## [5,] "Barbadori" "Castellani"
## [6,] "Barbadori" "Medici"
## [7,] "Bischeri" "Guadagni"
## [8,] "Bischeri" "Peruzzi"
## [9,] "Bischeri" "Strozzi"
## [10,] "Castellani" "Peruzzi"
## [11,] "Castellani" "Strozzi"
## [12,] "Guadagni" "Lamberteschi"
## [13,] "Guadagni" "Tornabuoni"
## [14,] "Medici" "Ridolfi"
## [15,] "Medici" "Salviati"
## [16,] "Medici" "Tornabuoni"
## [17,] "Pazzi" "Salviati"
## [18,] "Peruzzi" "Strozzi"
## [19,] "Ridolfi" "Strozzi"
## [20,] "Ridolfi" "Tornabuoni"
We also want to look at the adjacency matrix for this network.
as_adjacency_matrix(florence_igraph)
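To make the link between the edge list and the adjacency matrix concrete, here is a minimal base R sketch that builds a small adjacency matrix by hand. It uses only four of the edges listed above, and the object names (`edge_list`, `families`, `adj`) are illustrative rather than part of the workshop code.

```r
# Four edges taken from the Florentine edge list shown above
edge_list <- rbind(c("Acciaiuoli", "Medici"),
                   c("Medici", "Ridolfi"),
                   c("Medici", "Tornabuoni"),
                   c("Ridolfi", "Tornabuoni"))

# One row and one column per family, initially with no edges
families <- sort(unique(as.vector(edge_list)))
adj <- matrix(0, length(families), length(families),
              dimnames = list(families, families))

# A two-column character matrix indexes (row name, column name) pairs
adj[edge_list] <- 1
adj[edge_list[, 2:1]] <- 1   # mirror each edge, as the graph is undirected

print(adj)
print(rowSums(adj))          # row sums give the degree of each vertex
```

Because the graph is undirected, the matrix is symmetric and each edge appears twice.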
The elements of a graph can all have attributes:
graph_attr_names(florence_igraph)
## character(0)
vertex_attr_names(florence_igraph)
## [1] "name"
edge_attr_names(florence_igraph)
## character(0)
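A subgraph is a subset of the vertices and edges of a graph. An induced subgraph keeps a chosen set of vertices together with every edge of the original graph that connects two of them.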
family_keep <- c('Medici', 'Barbadori', 'Ridolfi','Tornabuoni','Pazzi'
,'Salviati', 'Albizzi', 'Guadagni')
florence_subgraph <- induced_subgraph(florence_igraph, family_keep)
ggplot(florence_subgraph, aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges() +
geom_label(aes(label = vertex.names), size = 5) +
theme_blank()
data(USairports)
### Show graph vertices
V(USairports)
## + 755/755 vertices, named, from bf6202d:
## [1] BGR BOS ANC JFK LAS MIA EWR BJC TEB LAX AEX BFI ELM GEG ICT PBI PIT SFO
## [19] VCT IAD ABE AGS AVL AVP BDL BHM BNA BTR BUF BWI CAE CAK CHA CHO CHS CLE
## [37] CLT CMH CRW CVG DAB DAY DCA DTW EWN FAY GNV GPT GSO GSP HPN HSV ILM IND
## [55] JAN LEX LGA LIT MDT MGM MKE MLB MOB MSP MSY MYR OAJ ORF PGV PHF PHL PNS
## [73] PWM RDU RIC ROA SAV SDF SRQ STL SYR TLH TRI TYS VPS XNA ALB BGM BTV ERI
## [91] FLO HHH HTS HVN IPT ISP ITH LYH MHT PVD ROC SBY SCE SWF BFD DUJ EYW FKL
## [109] FLL JHW LWB MCO PKB TPA ACT AOO BHB BKW BPT CKB CLL DRT GRK IAH JST LCH
## [127] LFT MAF MGW MLU ORD PBG PQI SHD SHV TYR CID MCI MLI MSN OKC OMA SBN SGF
## [145] TUL GKN ABQ ATL AUS BKG DEN DFW HOU MDW PDX PHX PIE RSW SAN SAT SEA SLC
## [163] SMF SNA TUS ABY ACY BMI DSM FNT GLH GRR GTR JAX MEM SJU UTM GUM ROP SPN
## + ... omitted several vertices
### Show graph edges
E(USairports)
## + 23473/23473 edges from bf6202d (vertex names):
## [1] BGR->JFK BGR->JFK BOS->EWR ANC->JFK JFK->ANC LAS->LAX MIA->JFK EWR->ANC
## [9] BJC->MIA MIA->BJC TEB->ANC JFK->LAX LAX->JFK LAX->SFO AEX->LAS BFI->SBA
## [17] ELM->PIT GEG->SUN ICT->PBI LAS->LAX LAS->PBI LAS->SFO LAX->LAS PBI->AEX
## [25] PBI->ICT PIT->VCT SFO->LAX VCT->DWH IAD->JFK ABE->CLT ABE->HPN AGS->CLT
## [33] AGS->CLT AVL->CLT AVL->CLT AVP->CLT AVP->PHL BDL->CLT BHM->CLT BHM->CLT
## [41] BNA->CLT BNA->CLT BNA->DCA BNA->PHL BTR->CLT BUF->CLT BUF->DCA BUF->PHL
## [49] BWI->PHL CAE->CLT CAE->CLT CAE->DCA CAK->CLT CAK->CLT CAK->DCA CAK->PHL
## [57] CHA->CLT CHA->DCA CHO->CLT CHS->CLT CHS->CLT CHS->DCA CLE->CLT CLE->CLT
## [65] CLE->PHL CLT->ABE CLT->AGS CLT->AGS CLT->AVL CLT->AVL CLT->AVP CLT->BDL
## [73] CLT->BHM CLT->BHM CLT->BNA CLT->BNA CLT->BTR CLT->BUF CLT->BWI CLT->CAE
## + ... omitted several edges
This is a much larger network, and visualising it is likely going to be a mess, but we will try anyway.
ggplot(USairports, aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges() +
geom_label(aes(label = vertex.names), size = 5) +
theme_blank()
Yeah, it is a mess.
We will try again with a small subgraph, using just the first 15 nodes.
usairport_subgraph <- induced_subgraph(USairports, 1:15)
ggplot(usairport_subgraph, aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges() +
geom_label(aes(label = vertex.names), size = 5) +
theme_blank()
Now that we have seen this network, we look at what additional information is here.
graph_attr_names(USairports)
## [1] "name"
vertex_attr_names(USairports)
## [1] "name" "City" "Position"
edge_attr_names(USairports)
## [1] "Carrier" "Departures" "Seats" "Passengers" "Aircraft"
## [6] "Distance"
We see that the edges in particular now have a number of attributes. We access them through edge_attr().
edge_attr(USairports) %>%
as_tibble()
## # A tibble: 23,473 x 6
## Carrier Departures Seats Passengers Aircraft Distance
## <chr> <dbl> <dbl> <dbl> <int> <dbl>
## 1 British Airways Plc 1 226 193 627 382
## 2 British Airways Plc 1 299 253 819 382
## 3 British Airways Plc 1 216 141 627 200
## 4 China Airlines Ltd. 13 5161 3135 819 3386
## 5 China Airlines Ltd. 13 5161 4097 819 3386
## 6 Korean Air Lines Co. Ltd. 14 3654 1353 627 236
## 7 Lan Ecuador 1 204 183 626 1090
## 8 Eva Airways Corporation 18 5718 4818 627 3370
## 9 G5 Executive Ag 1 14 2 667 1732
## 10 G5 Executive Ag 1 14 1 667 1732
## # ... with 23,463 more rows
data(lazega)
lazega <- lazega %>% upgrade_graph() # Data is in deprecated format.
### Show graph vertices
V(lazega)
## + 36/36 vertices, named, from abac241:
## [1] V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19
## [20] V20 V21 V22 V23 V24 V25 V26 V27 V28 V29 V30 V31 V32 V33 V34 V35 V36
### Show graph edges
E(lazega)
## + 115/115 edges from abac241 (vertex names):
## [1] V1 --V17 V2 --V7 V2 --V16 V2 --V17 V2 --V22 V2 --V26 V2 --V29 V3 --V18
## [9] V3 --V25 V3 --V28 V4 --V12 V4 --V17 V4 --V19 V4 --V20 V4 --V22 V4 --V26
## [17] V4 --V28 V4 --V29 V4 --V31 V5 --V18 V5 --V24 V5 --V28 V5 --V31 V5 --V32
## [25] V5 --V33 V6 --V24 V6 --V28 V6 --V30 V6 --V31 V6 --V32 V7 --V18 V9 --V12
## [33] V9 --V16 V9 --V29 V10--V24 V10--V26 V10--V29 V10--V31 V10--V34 V11--V17
## [41] V12--V15 V12--V16 V12--V17 V12--V19 V12--V26 V12--V29 V12--V34 V13--V31
## [49] V13--V33 V14--V16 V14--V17 V14--V25 V14--V28 V14--V30 V14--V32 V15--V16
## [57] V15--V19 V15--V20 V15--V22 V15--V24 V15--V26 V15--V29 V15--V32 V15--V35
## [65] V15--V36 V16--V17 V16--V22 V16--V26 V16--V27 V16--V29 V16--V32 V16--V34
## [73] V16--V36 V17--V19 V17--V22 V17--V24 V17--V25 V17--V26 V17--V28 V17--V29
## + ... omitted several edges
lazega_igraph <- lazega %>% upgrade_graph()
lazega_network <- lazega %>% intergraph::asNetwork()
ggplot(lazega, aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges() +
geom_label(aes(label = vertex.names), size = 3) +
theme_blank()
graph_attr_names(lazega)
## character(0)
vertex_attr_names(lazega)
## [1] "name" "Seniority" "Status" "Gender" "Office" "Years"
## [7] "Age" "Practice" "School"
edge_attr_names(lazega)
## character(0)
We see that the vertices have attributes but the edges have none.
vertex_attr(lazega) %>%
as_tibble()
## # A tibble: 36 x 9
## name Seniority Status Gender Office Years Age Practice School
## <chr> <int> <int> <int> <int> <int> <int> <int> <int>
## 1 V1 1 1 1 1 31 64 1 1
## 2 V2 2 1 1 1 32 62 2 1
## 3 V3 3 1 1 2 13 67 1 1
## 4 V4 4 1 1 1 31 59 2 3
## 5 V5 5 1 1 2 31 59 1 2
## 6 V6 6 1 1 2 29 55 1 1
## 7 V7 7 1 1 2 29 63 2 3
## 8 V8 8 1 1 1 28 53 1 3
## 9 V9 9 1 1 1 25 53 2 1
## 10 V10 10 1 1 1 25 53 2 3
## # ... with 26 more rows
Plotting network data is not automatic: mathematical concepts allow us to convert network topology into a form amenable to plotting.
plot_1 <- ggplot(florence_igraph, aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges() +
geom_label(aes(label = vertex.names), size = 3) +
theme_blank()
plot_2 <- ggplot(florence_igraph, aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges() +
geom_label(aes(label = vertex.names), size = 3) +
theme_blank()
plot_3 <- ggplot(florence_igraph, aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges() +
geom_label(aes(label = vertex.names), size = 3) +
theme_blank()
plot_4 <- ggplot(florence_igraph, aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges() +
geom_label(aes(label = vertex.names), size = 3) +
theme_blank()
plot_grid(plot_1, plot_2, plot_3, plot_4, ncol = 2)
To plot graphs visually, we need a way to transform the graphs into 2D coordinates. A number of layout algorithms exist.
To show the differences, we plot the Florentine network using a number of different layout algorithms.
ggplot(ggnetwork(florence_igraph, layout = 'fruchtermanreingold')
,aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges() +
geom_label(aes(label = vertex.names)) +
ggtitle('The Florentine Network Using Fruchterman-Reingold Layout') +
theme_blank()
ggplot(ggnetwork(florence_igraph, layout = 'spring')
,aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges() +
geom_label(aes(label = vertex.names)) +
ggtitle('The Florentine Network Using Spring Layout') +
theme_blank()
ggplot(ggnetwork(florence_igraph, layout = 'mds')
,aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges() +
geom_label(aes(label = vertex.names)) +
ggtitle('The Florentine Network Using MDS Layout') +
theme_blank()
ggplot(ggnetwork(florence_igraph, layout = 'circle')
,aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges() +
geom_label(aes(label = vertex.names)) +
ggtitle('The Florentine Network Using Circular Layout') +
theme_blank()
We save the FR layout for future plotting so that all plots look the same.
florentine_fr_layout <- ggnetwork(florence_igraph, layout = 'fruchtermanreingold')
The degree of a vertex is the count of connections from that vertex.
We now look at the distribution of vertex degree for the Florentine network:
ggplot() +
geom_bar(aes(x = igraph::degree(florence_igraph))) +
xlab("Vertex Degree") +
ylab("Count of Degrees")
The edge density is the ratio of the number of edges in the graph to the total possible number of edges.
From combinatorics, the total possible number of edges between \(N_v\) vertices is \(\frac{N_v(N_v-1)}{2}\).
Thus, for a network of order \(N_v\) and size \(N_e\), the density is given by
\[ \text{density} = \frac{2 N_e}{N_v (N_v - 1)} \]
florence_igraph %>% edge_density
## [1] 0.1666667
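As a quick check, the density figure can be reproduced by hand as edges divided by possible edges. This is a minimal base R sketch; `density_from_counts` is a hypothetical helper name, and the counts are the 16 vertices and 20 edges reported for the Florentine graph earlier.

```r
# Hypothetical helper: density of a simple undirected graph from its counts
density_from_counts <- function(n_vertices, n_edges) {
  max_edges <- n_vertices * (n_vertices - 1) / 2   # all possible vertex pairs
  n_edges / max_edges
}

flor_density <- density_from_counts(n_vertices = 16, n_edges = 20)
print(flor_density)   # 20 / 120 = 0.1666667
```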
Another interesting quantity is the average degree of all the neighbours of each vertex.
flor_knn <- knn(florence_igraph)$knn
flor_knn_tbl <- data_frame(family = names(flor_knn)
,degree = igraph::degree(florence_igraph)
,knn = flor_knn
)
ggplot(flor_knn_tbl) +
geom_point(aes(x = degree, y = knn)) +
geom_text_repel(aes(x = degree, y = knn, label = family)) +
expand_limits(y = 0) +
xlab("Vertex Degree") +
ylab("KNN Degree")
Between-ness measures how often a vertex lies on the shortest paths between pairs of other vertices in the graph.
\[ c_B(\nu) = \sum_{s \neq t \neq \nu \in V} \frac{\sigma(s, t | \nu)}{\sigma(s,t)} \]
where \(\sigma(s, t| \nu)\) is the count of shortest paths between \(s\) and \(t\) that go through \(\nu\) and \(\sigma(s, t)\) is the total number of shortest paths between \(s\) and \(t\).
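As a small worked example (the vertex labels are illustrative, and each unordered pair \(\{s, t\}\) is assumed to be counted once), take the 4-cycle with edges \(a-b\), \(b-c\), \(c-d\) and \(d-a\). The pairs that exclude \(b\) are \(\{a,c\}\), \(\{a,d\}\) and \(\{c,d\}\). Vertices \(a\) and \(c\) are joined by two shortest paths, one through \(b\) and one through \(d\), while the other two pairs are adjacent, so
\[ c_B(b) = \frac{\sigma(a, c | b)}{\sigma(a, c)} + \frac{\sigma(a, d | b)}{\sigma(a, d)} + \frac{\sigma(c, d | b)}{\sigma(c, d)} = \frac{1}{2} + 0 + 0 = \frac{1}{2} \]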
We now calculate the between-ness centrality for each vertex in the Florence marriage network.
florence_betweenness <- florence_igraph %>%
(igraph::betweenness)() %>%
sort(decreasing = TRUE)
ggplot() +
geom_col(aes(x = names(florence_betweenness), y = florence_betweenness)) +
xlab("Family") +
ylab("Between-ness") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
Closeness centrality is a measure of the distance of the vertex from the other vertices in the graph: the smaller the total distance, the higher the closeness.
\[ c_{Cl}(\nu) = \frac{1}{\sum_{u \in V} \text{dist}(u, \nu)} \]
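The formula can be checked by hand on a toy graph. The sketch below, in base R only, computes all shortest-path lengths on a four-vertex path graph with the Floyd-Warshall algorithm and then applies the definition. The graph and the names `dist_mat` and `closeness_manual` are illustrative and not part of the workshop data.

```r
# Adjacency matrix of the path graph a - b - c - d
adj <- matrix(0, 4, 4, dimnames = list(letters[1:4], letters[1:4]))
adj["a", "b"] <- 1
adj["b", "c"] <- 1
adj["c", "d"] <- 1
adj <- adj + t(adj)          # make the matrix symmetric

# Start from direct edges (length 1) and "no path yet" (Inf)
dist_mat <- ifelse(adj == 1, 1, Inf)
diag(dist_mat) <- 0

# Floyd-Warshall: allow each vertex k in turn as an intermediate stop
for (k in seq_len(nrow(dist_mat))) {
  dist_mat <- pmin(dist_mat, outer(dist_mat[, k], dist_mat[k, ], "+"))
}

# Closeness is the reciprocal of the total distance to all other vertices
closeness_manual <- 1 / rowSums(dist_mat)
print(closeness_manual)
```

The two middle vertices come out with the highest closeness, as expected for a path.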
florence_closeness <- florence_igraph %>%
(igraph::closeness)() %>%
sort(decreasing = TRUE)
ggplot() +
geom_col(aes(x = names(florence_closeness), y = florence_closeness)) +
xlab("Family") +
ylab("Closeness") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
Eigenvector centrality is a class of centrality measures based on linear systems derived from the graph.
The most common of these uses the leading eigenvector of the adjacency matrix produced from the graph.
The key idea here is that vertices which are ‘central’ in the network are so due to their neighbours being ‘central’. This definition is self-referential, so the scores cannot be written down directly and are instead found by solving an eigenvalue problem.
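One way to see the ‘central because my neighbours are central’ idea at work is power iteration: start with equal scores and repeatedly replace each score with the sum of its neighbours' scores. The base R sketch below does this on a toy graph (a triangle a, b, c with a pendant vertex d attached to a). It is an illustration of the idea, not a description of how igraph computes the result, and `power_iteration` is a hypothetical helper name.

```r
adj <- matrix(c(0, 1, 1, 1,
                1, 0, 1, 0,
                1, 1, 0, 0,
                1, 0, 0, 0), nrow = 4, byrow = TRUE,
              dimnames = list(letters[1:4], letters[1:4]))

# Hypothetical helper: repeated neighbour-sum updates, rescaled to a maximum of 1
power_iteration <- function(adj_mat, n_steps = 200) {
  score <- rep(1, nrow(adj_mat))
  for (i in seq_len(n_steps)) {
    score <- as.vector(adj_mat %*% score)   # each score becomes the sum over neighbours
    score <- score / max(score)             # rescale so the values stay bounded
  }
  setNames(score, rownames(adj_mat))
}

eigen_manual <- power_iteration(adj)
print(round(eigen_manual, 3))

# The same scores, up to scaling, from a direct eigendecomposition
leading <- abs(eigen(adj)$vectors[, 1])
print(round(leading / max(leading), 3))
```

The iteration settles here because the toy graph is connected and contains a triangle; on a bipartite graph this simple version can oscillate.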
florence_eigencent <- florence_igraph %>%
eigen_centrality() %>%
.$vector %>%
sort(decreasing = TRUE)
ggplot() +
geom_col(aes(x = names(florence_eigencent), y = florence_eigencent)) +
xlab("Family") +
ylab("Eigenvector Centrality") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
Another way of thinking about centrality is through articulation points: an articulation point is a vertex that joins two or more parts of the graph, so that removing it increases the count of components of the graph.
Identifying articulation points may highlight vulnerabilities in the network, or help identify key vertices that would otherwise be overlooked in analysis.
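The definition translates directly into a brute-force check: delete each vertex in turn and see whether the component count goes up. The base R sketch below does this on a toy graph (a triangle with one pendant vertex). Both helper names are hypothetical, and this is far slower than a dedicated algorithm, so it is only meant to illustrate the definition.

```r
# Hypothetical helper: count connected components by repeated breadth-first search
count_components_manual <- function(adj_mat) {
  n <- nrow(adj_mat)
  if (n == 0) return(0)
  unseen <- rep(TRUE, n)
  n_comp <- 0
  while (any(unseen)) {
    frontier <- which(unseen)[1]             # start a new component
    unseen[frontier] <- FALSE
    while (length(frontier) > 0) {
      reached <- colSums(adj_mat[frontier, , drop = FALSE]) > 0
      neighbours <- which(reached & unseen)  # newly discovered vertices
      unseen[neighbours] <- FALSE
      frontier <- neighbours
    }
    n_comp <- n_comp + 1
  }
  n_comp
}

# Hypothetical helper: a vertex is an articulation point if removing it adds components
is_articulation <- function(adj_mat) {
  base_count <- count_components_manual(adj_mat)
  sapply(seq_len(nrow(adj_mat)), function(v) {
    count_components_manual(adj_mat[-v, -v, drop = FALSE]) > base_count
  })
}

# Triangle on vertices 1, 2, 3 with vertex 4 attached only to vertex 1
adj <- matrix(c(0, 1, 1, 1,
                1, 0, 1, 0,
                1, 1, 0, 0,
                1, 0, 0, 0), nrow = 4, byrow = TRUE)
artic_flags <- is_articulation(adj)
print(artic_flags)   # only vertex 1 is flagged
```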
florence_artic <- florence_igraph %>% articulation_points() %>% names()
artic_label_tbl <- data_frame(vertex.names = V(florence_igraph) %>% names()) %>%
mutate(is_artic = map_lgl(vertex.names, function(x) x %in% florence_artic))
florentine_plot_layout <- florentine_fr_layout %>%
merge(artic_label_tbl, by = 'vertex.names')
florentine_artic_plot <- ggplot(florentine_plot_layout
,aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges() +
geom_nodelabel(aes(label = vertex.names, fill = is_artic)) +
theme_blank(legend.position = 'none')
florentine_artic_plot %>% plot()
Edge betweenness applies the same idea as between-ness centrality to edges: we want to look at which edges are the most influential in the network.
florence_edge_names <- florence_igraph %>%
as_edgelist() %>%
as_tibble() %>%
mutate(edge_name = paste0(V1, '--', V2)) %>%
pull(edge_name)
florence_edge_betweenness <- florence_igraph %>%
igraph::edge_betweenness()
florence_edge_between_tbl <- data_frame(
edge_names = florence_edge_names
,edge_betweenness = florence_edge_betweenness
)
ggplot(florence_edge_between_tbl) +
geom_col(aes(x = edge_names, y = edge_betweenness)) +
xlab("Edge Names") +
ylab("Edge Betweenness") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
Many measures of vertex centrality do not transfer to edges as readily as betweenness centrality does. For that reason, we can convert a network into its line graph: each edge in the original graph becomes a vertex in the line graph, and we connect two of these ‘edge nodes’ with an edge if the corresponding edges share a vertex in the original.
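The construction is short enough to write out by hand. The base R sketch below builds the line graph's adjacency matrix for a toy path graph with three edges; the names `edges`, `shares_vertex` and `line_adj` are illustrative.

```r
# Edges of the path graph a - b - c - d
edges <- rbind(c("a", "b"), c("b", "c"), c("c", "d"))
n_edges <- nrow(edges)

# Two edges are neighbours in the line graph if they have an endpoint in common
shares_vertex <- function(i, j) {
  length(intersect(edges[i, ], edges[j, ])) > 0
}

line_adj <- outer(seq_len(n_edges), seq_len(n_edges), Vectorize(shares_vertex)) * 1
diag(line_adj) <- 0    # an edge is not its own neighbour
dimnames(line_adj) <- list(paste0(edges[, 1], "--", edges[, 2]),
                           paste0(edges[, 1], "--", edges[, 2]))
print(line_adj)
```

The first and last edges do not touch, so they are not connected in the line graph, while the middle edge is connected to both.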
florence_linegraph_igraph <- florence_igraph %>%
make_line_graph()
florence_linegraph_igraph <- florence_linegraph_igraph %>%
set_vertex_attr(name = 'name', value = florence_edge_names)
ggplot(florence_linegraph_igraph, aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges() +
geom_label(aes(label = vertex.names), size = 2) +
theme_blank()
We can now use the line graph to discover information about the edges in the original graph.
A clique is a ‘maximally-connected subgraph’: a subset of the nodes of the graph in which every pair of nodes is directly connected.
clique_size <- florence_igraph %>%
cliques %>%
map_int(length)
ggplot() +
geom_bar(aes(x = clique_size)) +
xlab("Clique Size") +
ylab("Count")
The transitivity of the network is a measure of the ‘density’ of triangles in the network. It is the ratio of the number of triangles (counted once per corner) to the number of connected triples, that is, paths of length two that could be closed into a triangle.
Local transitivity does the same for the triples centred on a given vertex.
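The global ratio can be computed directly from the adjacency matrix, which makes the definition concrete. This is a base R sketch on a toy graph (a triangle with one pendant vertex, not part of the workshop data). It relies on two counting facts: the trace of \(A^3\) counts closed three-step walks, and the off-diagonal sum of \(A^2\) counts two-step paths between distinct vertices.

```r
adj <- matrix(c(0, 1, 1, 1,
                1, 0, 1, 0,
                1, 1, 0, 0,
                1, 0, 0, 0), nrow = 4, byrow = TRUE)

adj2 <- adj %*% adj
adj3 <- adj2 %*% adj

closed_paths <- sum(diag(adj3))               # 6 per triangle
all_paths <- sum(adj2) - sum(diag(adj2))      # 2 per connected triple

transitivity_manual <- closed_paths / all_paths
print(transitivity_manual)   # 6 / 10 = 0.6
```

The graph has one triangle and five connected triples, so the ratio is 3 × 1 / 5.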
florence_igraph %>%
transitivity()
## [1] 0.1914894
florence_igraph %>%
transitivity(type = 'local', vids = c('Strozzi', 'Guadagni', 'Medici'))
## [1] 0.33333333 0.00000000 0.06666667
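The shortest-path measures describe the size and connectivity of the graph: the mean distance is the average shortest-path length between pairs of vertices, and the diameter is the longest such shortest path.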
florence_igraph %>% mean_distance()
## [1] 2.485714
florence_igraph %>% diameter()
## [1] 5
florence_partition <- cluster_fast_greedy(florence_igraph)
florence_partition %>% print
## IGRAPH clustering fast greedy, groups: 4, mod: 0.4
## + groups:
## $`1`
## [1] "Acciaiuoli" "Medici" "Pazzi" "Ridolfi" "Salviati"
## [6] "Tornabuoni"
##
## $`2`
## [1] "Albizzi" "Ginori" "Guadagni" "Lamberteschi"
##
## $`3`
## [1] "Barbadori" "Bischeri" "Castellani" "Peruzzi" "Strozzi"
##
## + ... omitted several groups/vertices
florence_partition %>% str()
## List of 4
## $ merges : chr [1:6] "Acciaiuoli" "Medici" "Pazzi" "Ridolfi" ...
## $ modularity: chr [1:4] "Albizzi" "Ginori" "Guadagni" "Lamberteschi"
## $ membership: chr [1:5] "Barbadori" "Bischeri" "Castellani" "Peruzzi" ...
## $ names : chr "Pucci"
## - attr(*, "class")= chr "communities"
We can now replot the network but colour each of the nodes by their cluster membership.
hier_label_tbl <- data_frame(
vertex.names = V(florence_igraph) %>% names()
,cluster_hier = florence_partition %>% membership() %>% as.character()
)
florentine_plot_layout <- florentine_fr_layout %>%
merge(hier_label_tbl, by = 'vertex.names')
cluster_hier_plot <- ggplot(florentine_plot_layout
,aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges() +
geom_nodelabel(aes(label = vertex.names, fill = cluster_hier)) +
theme_blank()
cluster_hier_plot %>% plot()
We can use more direct linear algebra routines to partition the graph. To do this we construct the ‘graph Laplacian’ from the degrees of each vertex and its adjacency matrix.
\[ \mathbf{L} = \mathbf{D} - \mathbf{A} \]
By analysing the eigenvalues and eigenvectors of this matrix, and recursively applying splits to the graphs based on the size of the eigenvalues, we break this network into pieces.
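A toy example shows how the eigenvectors carry the split. The base R sketch below builds the Laplacian of two triangles joined by a single bridge edge and inspects its spectrum. The graph and all object names are illustrative. Two standard facts are used: the number of zero eigenvalues equals the number of connected components, and the signs of the eigenvector for the second-smallest eigenvalue (often called the Fiedler vector) suggest a two-way split.

```r
edges <- rbind(c(1, 2), c(2, 3), c(1, 3),   # first triangle
               c(4, 5), c(5, 6), c(4, 6),   # second triangle
               c(3, 4))                     # bridge between them

adj <- matrix(0, 6, 6)
adj[edges] <- 1
adj[edges[, 2:1]] <- 1

# L = D - A, with the vertex degrees on the diagonal of D
laplacian <- diag(rowSums(adj)) - adj

# eigen() returns eigenvalues in decreasing order for symmetric input
lap_eigen <- eigen(laplacian, symmetric = TRUE)
print(round(lap_eigen$values, 3))

n_zero <- sum(abs(lap_eigen$values) < 1e-8)   # one component, so one zero
fiedler <- lap_eigen$vectors[, nrow(adj) - 1]
split_groups <- ifelse(fiedler > 0, "A", "B")
print(split_groups)
```

The sign split recovers the two triangles; which triangle is labelled "A" depends on the arbitrary sign of the eigenvector.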
florence_laplacian <- laplacian_matrix(florence_igraph)
flor_laplac_eigen <- eigen(florence_laplacian)
flor_laplac_eigen %>% print(digits = 2)
## eigen() decomposition
## $values
## [1] 7.3e+00 5.5e+00 5.3e+00 4.3e+00 3.6e+00 3.4e+00 2.6e+00 2.5e+00
## [9] 1.6e+00 1.5e+00 8.0e-01 7.0e-01 5.3e-01 3.5e-01 3.3e-16 -2.4e-16
##
## $vectors
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] -1.4e-01 4.7e-02 -1.6e-02 -4.3e-02 3.1e-02 -1.7e-02 -3.0e-03 6.8e-02
## [2,] -2.5e-01 -1.8e-01 1.5e-01 1.1e-01 -3.2e-01 6.0e-01 4.2e-01 -1.6e-01
## [3,] -1.7e-01 1.4e-01 4.8e-02 1.6e-01 -1.4e-01 -3.3e-01 2.3e-01 4.3e-02
## [4,] -5.0e-02 -4.6e-01 -4.3e-02 -1.8e-01 -1.8e-01 -5.1e-01 3.3e-01 -1.8e-01
## [5,] 2.7e-02 -2.7e-01 -2.3e-01 -5.0e-01 3.1e-01 4.1e-01 -1.5e-01 8.2e-02
## [6,] 4.0e-02 3.9e-02 -3.5e-02 -3.4e-02 1.2e-01 -2.5e-01 -2.6e-01 1.1e-01
## [7,] 1.6e-01 6.2e-01 -3.9e-01 -2.5e-01 1.5e-01 -2.0e-03 4.1e-01 -8.4e-02
## [8,] -2.6e-02 -1.4e-01 9.0e-02 7.6e-02 -5.9e-02 8.3e-04 -2.5e-01 5.5e-02
## [9,] 8.7e-01 -2.1e-01 6.8e-02 1.4e-01 -8.2e-02 4.1e-02 4.8e-03 -1.0e-01
## [10,] 2.7e-02 -1.4e-02 5.0e-03 2.2e-02 -2.6e-02 1.8e-02 1.9e-01 4.9e-01
## [11,] -9.2e-03 1.3e-01 -1.6e-01 7.2e-01 2.2e-01 1.4e-01 -7.7e-02 -7.0e-02
## [12,] -8.5e-22 -3.5e-18 -6.9e-18 1.4e-17 1.1e-16 1.1e-16 -8.3e-17 1.4e-17
## [13,] -1.7e-01 -2.0e-02 -4.4e-01 2.7e-02 -4.5e-01 -7.5e-03 -3.9e-01 2.7e-01
## [14,] -1.7e-01 6.4e-02 -2.2e-02 -7.2e-02 6.7e-02 -4.3e-02 -3.0e-01 -7.4e-01
## [15,] 6.2e-02 4.2e-01 6.5e-01 -2.4e-01 -2.6e-01 4.9e-02 -2.1e-01 6.6e-02
## [16,] -2.0e-01 -1.5e-01 3.3e-01 6.3e-02 6.2e-01 -8.7e-02 6.0e-02 1.7e-01
## [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16]
## [1,] -3.4e-01 2.1e-01 7.5e-01 -4.1e-01 -1.2e-02 9.3e-02 -0.0012 -0.2582
## [2,] 5.2e-02 -8.4e-02 -7.3e-02 -1.0e-01 -3.3e-01 -7.5e-02 -0.0012 -0.2582
## [3,] 6.6e-01 4.1e-01 -2.7e-02 -1.7e-01 2.1e-01 -5.1e-02 -0.0012 -0.2582
## [4,] -3.8e-01 -8.6e-02 -1.5e-01 7.6e-02 1.8e-01 -1.8e-01 -0.0012 -0.2582
## [5,] 9.5e-02 3.0e-01 -1.8e-01 -9.4e-02 3.1e-01 -1.5e-01 -0.0012 -0.2582
## [6,] -9.1e-02 1.5e-01 -3.7e-01 -3.3e-01 -6.9e-01 -1.1e-01 -0.0012 -0.2582
## [7,] -2.5e-02 -1.6e-01 5.6e-02 2.2e-01 -1.1e-01 -1.5e-01 -0.0012 -0.2582
## [8,] 4.4e-02 3.0e-01 2.8e-01 7.5e-01 -2.4e-01 -2.2e-01 -0.0012 -0.2582
## [9,] 1.9e-01 -1.2e-01 1.5e-01 -1.2e-01 -5.5e-03 6.1e-02 -0.0012 -0.2582
## [10,] -1.5e-01 9.3e-02 -2.0e-01 2.0e-01 1.8e-02 7.4e-01 -0.0012 -0.2582
## [11,] -3.3e-01 1.0e-01 -2.2e-01 -2.0e-02 3.0e-01 -1.8e-01 -0.0012 -0.2582
## [12,] -1.1e-16 -1.7e-16 8.3e-17 1.7e-16 -6.7e-16 1.7e-16 1.0000 -0.0047
## [13,] 1.6e-01 -4.8e-01 5.1e-02 -5.8e-02 1.0e-01 -5.5e-02 -0.0012 -0.2582
## [14,] 8.6e-02 -5.0e-02 -3.9e-02 6.1e-02 8.6e-03 4.8e-01 -0.0012 -0.2582
## [15,] -1.9e-01 -6.7e-02 -1.6e-01 -2.9e-02 2.6e-01 -1.5e-01 -0.0012 -0.2582
## [16,] 2.2e-01 -5.2e-01 1.2e-01 1.9e-02 -7.2e-03 -5.3e-02 -0.0012 -0.2582
We look at the eigenvalues ranked in order.
ggplot() +
geom_line(aes(x = seq_along(flor_laplac_eigen$values)
,y = flor_laplac_eigen$values)) +
expand_limits(y = 0) +
xlab("Eigenvalue Ranking") +
ylab("Eigenvalue")
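We now cluster using a related spectral method: cluster_leading_eigen() splits the graph using the leading eigenvector of the modularity matrix.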
florence_spec_partition <- florence_igraph %>%
cluster_leading_eigen()
spec_label_tbl <- data_frame(
vertex.names = V(florence_igraph) %>% names()
,cluster_spec = florence_spec_partition %>% membership() %>% as.character()
)
florentine_plot_layout <- florentine_fr_layout %>%
merge(spec_label_tbl, by = 'vertex.names')
cluster_spec_plot <- ggplot(florentine_plot_layout
,aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges() +
geom_nodelabel(aes(label = vertex.names, fill = cluster_spec)) +
theme_blank()
cluster_spec_plot %>% plot()
We plot the two groupings beside each other to compare them.
plot_grid(cluster_hier_plot, cluster_spec_plot, ncol = 2)
Assortativity is a measure, analogous to correlation, of the tendency for nodes with similar properties to connect to one another.
The Florentine marriage data is a little unusual in that it does not contain any properties on the vertices or edges, but assortativity in degree can be calculated, measuring the tendency for high degree nodes to connect to one another.
assortativity_degree(florence_igraph)
## [1] -0.3748379
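Degree assortativity can be read as the Pearson correlation between the degrees found at the two ends of each edge. The base R sketch below applies that reading to a toy star graph, where the high-degree hub connects only to degree-one leaves. `degree_assortativity_manual` is a hypothetical helper, and it assumes an undirected edge list with character vertex names.

```r
# Hypothetical helper: correlation of degrees across the two ends of each edge
degree_assortativity_manual <- function(edges) {
  degree <- table(as.vector(edges))            # degree of each vertex
  from_deg <- as.numeric(degree[edges[, 1]])
  to_deg <- as.numeric(degree[edges[, 2]])
  # Count each undirected edge in both directions so the measure is symmetric
  cor(c(from_deg, to_deg), c(to_deg, from_deg))
}

star_edges <- rbind(c("hub", "a"), c("hub", "b"), c("hub", "c"))
star_assort <- degree_assortativity_manual(star_edges)
print(star_assort)   # -1: high degree always meets low degree
```

A negative value, as for the Florentine network, indicates that high degree nodes tend to connect to low degree nodes rather than to each other.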
There are slightly different calculations for assortativity, depending on whether the attribute is numerical or categorical.
To test this, we use the cluster memberships from the previous section as a categorical attribute on the Florentine graph and then measure the assortativity associated with that attribute.
assortativity_nominal(florence_igraph, membership(florence_partition))
## [1] 0.6146435
assortativity_nominal(florence_igraph, membership(florence_spec_partition))
## [1] 0.5480226
We now move on to modelling graph data using statistical methods.
To begin, we start with very simple generative processes for graphs, investigating how we can use these methods to approximate data we have.
We start with basic statistical models where graphs are generated purely at random, subject to matching basic measures of the observed graph such as node and edge count, degree distribution and so on.
The building blocks for these are Erdős-Rényi models, probably the simplest models we can produce.
The simplest random graph model has a fixed number of nodes \(n\). Either we fix the number of edges \(m\) and treat every graph with that many edges as equally likely, which is the \(G(n,m)\) model, or we include each possible edge independently with a fixed probability \(p\), which is the \(G(n,p)\) model.
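The difference between the two models is easy to see when the sampling is written out. The base R sketch below enumerates every possible edge on 50 nodes and then samples from that list in the two ways described. The object names are illustrative, and the parameter values (75 edges, probability 0.05) are chosen to mirror the workshop examples.

```r
set.seed(42)

n_nodes <- 50
pairs <- t(combn(n_nodes, 2))   # every possible undirected edge, one per row

# G(n, m): choose exactly m of the possible edges, all choices equally likely
m_edges <- 75
gnm_edges <- pairs[sample(nrow(pairs), m_edges), , drop = FALSE]

# G(n, p): keep each possible edge independently with probability p
p_edge <- 0.05
gnp_edges <- pairs[runif(nrow(pairs)) < p_edge, , drop = FALSE]

print(nrow(pairs))       # 50 * 49 / 2 = 1225 possible edges
print(nrow(gnm_edges))   # always exactly 75
print(nrow(gnp_edges))   # random, about 1225 * 0.05 on average
```

In \(G(n,m)\) the edge count is fixed and only the placement is random, while in \(G(n,p)\) the edge count itself varies from sample to sample.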
We start with the \(G(n,m)\) model on a network with 50 nodes so that processing and visualisation are fast.
gnmsample_igraph <- sample_gnm(50, 75)
ggplot(ggnetwork(gnmsample_igraph, layout = 'fruchtermanreingold')
,aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges() +
geom_label(aes(label = vertex.names)) +
ggtitle('Sample G(n,m) Graph') +
theme_blank()
ggplot(ggnetwork(gnmsample_igraph, layout = 'circle')
,aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges() +
geom_label(aes(label = vertex.names)) +
ggtitle('Sample G(n,m) Graph with Circular Layout') +
theme_blank()
Similarly, we generate a \(G(n, p)\) graph.
gnpsample_igraph <- sample_gnp(50, 0.05)
ggplot(ggnetwork(gnpsample_igraph, layout = 'fruchtermanreingold')
,aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges() +
geom_label(aes(label = vertex.names)) +
ggtitle('Sample G(n,p) Graph') +
theme_blank()
ggplot(ggnetwork(gnpsample_igraph, layout = 'circle')
,aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges() +
geom_label(aes(label = vertex.names)) +
ggtitle('Sample G(n,p) Graph with Circular Layout') +
theme_blank()
Expanding this concept, we can generate graphs based on more advanced measures of the graph, such as the degree distribution.
To show how this works, we create a 50-node graph where each node has a degree between 1 and 4.
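sample_degreedist <- sample(1:4, 50, replace = TRUE)
# A degree sequence needs an even sum; bump one entry below 4 if the sum is odd.
if (sum(sample_degreedist) %% 2 == 1) {
odd_idx <- which(sample_degreedist < 4)[1]
sample_degreedist[odd_idx] <- sample_degreedist[odd_idx] + 1
}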
degdistsample_igraph <- sample_degseq(sample_degreedist, method = 'simple.no.multiple')
ggplot(ggnetwork(degdistsample_igraph)
,aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges() +
geom_label(aes(label = vertex.names)) +
ggtitle('Sample Degree Distribution Graph') +
theme_blank()
More advanced algorithms exist to construct random graphs on other characteristics, but most of those rely on Markov Chain Monte Carlo methods and are beyond the scope of this workshop.
Now that we have a few methods for producing these random graphs, the next logical issue is assessing how well these models capture aspects of our data.
As an example, we use the Florentine dataset: we produce random graphs that match its node and edge counts, and then compare other measures of the observed network, such as clustering, diameter and average path length, with what our models produce.
NOTE: This code may look a little cryptic and overly-concise at first, as I use functional methods to produce the simulations. There is nothing fancy happening here, so look up the functions in purrr if you get confused.
n_iter <- 1000
flor_count_node <- gorder(florence_igraph)
flor_count_edge <- gsize (florence_igraph)
sim_data_tbl <- data_frame(sim_id = 1:n_iter) %>%
mutate(graph = rerun(n_iter, sample_gnm(n = flor_count_node, flor_count_edge))
,trans = map_dbl(graph, transitivity)
,diam = map_dbl(graph, diameter)
,meandist = map_dbl(graph, mean_distance)
,max_degree = map_dbl(graph, function(x) x %>% igraph::degree() %>% max)
,n_comp = map_dbl(graph, function(x) x %>% count_components)
,n_clust = map_dbl(graph, function(x) x %>% cluster_fast_greedy() %>% length)
)
graph_vals_tbl <- data_frame(
parameter = c('trans','diam','meandist', 'max_degree', 'n_comp', 'n_clust')
,graph_val = c(florence_igraph %>% transitivity
,florence_igraph %>% diameter
,florence_igraph %>% mean_distance
,florence_igraph %>% igraph::degree() %>% max
,florence_igraph %>% count_components()
,florence_igraph %>% cluster_fast_greedy() %>% length()
)
)
plot_data_tbl <- sim_data_tbl %>%
select(-graph) %>%
gather('parameter','value', -sim_id)
ggplot(plot_data_tbl) +
geom_histogram(aes(x = value), bins = 50) +
geom_vline(aes(xintercept = graph_val), colour = 'red', data = graph_vals_tbl) +
facet_wrap(~parameter, scales = 'free') +
scale_y_continuous(label = comma) +
xlab('Value') +
ylab('Count')
We do something similar for the \(G(n,p)\) model.
n_iter <- 1000
flor_count_node <- gorder(florence_igraph)
flor_count_edge <- gsize (florence_igraph)
edge_prop <- flor_count_edge / (0.5 * flor_count_node * (flor_count_node-1))
sim_data_tbl <- data_frame(sim_id = 1:n_iter) %>%
mutate(graph = rerun(n_iter, sample_gnp(n = flor_count_node, p = edge_prop))
,trans = map_dbl(graph, transitivity)
,diam = map_dbl(graph, diameter)
,meandist = map_dbl(graph, mean_distance)
,max_degree = map_dbl(graph, function(x) x %>% igraph::degree() %>% max)
,n_comp = map_dbl(graph, function(x) x %>% count_components)
,n_clust = map_dbl(graph, function(x) x %>% cluster_fast_greedy() %>% length)
)
graph_vals_tbl <- data_frame(
parameter = c('trans','diam','meandist', 'max_degree', 'n_comp', 'n_clust')
,graph_val = c(florence_igraph %>% transitivity
,florence_igraph %>% diameter
,florence_igraph %>% mean_distance
,florence_igraph %>% igraph::degree() %>% max
,florence_igraph %>% count_components()
,florence_igraph %>% cluster_fast_greedy() %>% length()
)
)
plot_data_tbl <- sim_data_tbl %>%
select(-graph) %>%
gather('parameter','value', -sim_id)
ggplot(plot_data_tbl) +
geom_histogram(aes(x = value), bins = 50) +
geom_vline(aes(xintercept = graph_val), colour = 'red', data = graph_vals_tbl) +
facet_wrap(~parameter, scales = 'free') +
scale_y_continuous(label = comma) +
xlab('Value') +
ylab('Count')
The \(G(n,p)\) model looks very similar, as expected.
Finally, we also try the degree distribution sample.
flor_degdist <- igraph::degree(florence_igraph)
sim_data_tbl <- data_frame(sim_id = 1:n_iter) %>%
mutate(graph = rerun(n_iter, sample_degseq(flor_degdist) %>% simplify())
,trans = map_dbl(graph, transitivity)
,diam = map_dbl(graph, diameter)
,meandist = map_dbl(graph, mean_distance)
,max_degree = map_dbl(graph, function(x) x %>% igraph::degree() %>% max)
,n_comp = map_dbl(graph, function(x) x %>% count_components)
,n_clust = map_dbl(graph, function(x) x %>% cluster_fast_greedy() %>% length)
)
graph_vals_tbl <- data_frame(
parameter = c('trans','diam','meandist', 'max_degree', 'n_comp','n_clust')
,graph_val = c(florence_igraph %>% transitivity
,florence_igraph %>% diameter
,florence_igraph %>% mean_distance
,florence_igraph %>% igraph::degree() %>% max
,florence_igraph %>% count_components()
,florence_igraph %>% cluster_fast_greedy() %>% length()
)
)
plot_data_tbl <- sim_data_tbl %>%
select(-graph) %>%
gather('parameter','value', -sim_id)
ggplot(plot_data_tbl) +
geom_histogram(aes(x = value), bins = 50) +
geom_vline(aes(xintercept = graph_val), colour = 'red', data = graph_vals_tbl) +
facet_wrap(~parameter, scales = 'free') +
scale_y_continuous(label = comma) +
xlab('Value') +
ylab('Count')
While small and useful for illustrating the basics, the Florentine marriage network may not be a sound example for judging the quality of these statistical models.
Because it is so small, the range of networks with these node and edge counts is limited, so a random graph is likely to agree with it on these measures.
It is more instructive to try larger networks, and see how effective these simple models are at reconstructing them.
SPOILER ALERT: They kinda suck at it
We now try the above models on the Lazega data.
lazega_count_node <- gorder(lazega_igraph)
lazega_count_edge <- gsize (lazega_igraph)
run_gnm <- function() sample_gnm(n = lazega_count_node, lazega_count_edge)
lazega_gnm_lst <- run_network_model_assessment(lazega_igraph, run_gnm, n_iter = 1000)
plot(lazega_gnm_lst$assess_plot)
As you can see, the Lazega network is larger than the Florentine network (though still small in absolute terms) and we already see that the observed values of the network differ from our simulations.
The clustering coefficient in the Lazega network in particular is not well captured by the model.
We try the degree distribution model too, and see if that does a better job.
lazega_degdist <- lazega_igraph %>% igraph::degree()
run_degdist <- function() sample_degseq(lazega_degdist) %>% simplify()
lazega_degdist_lst <- run_network_model_assessment(lazega_igraph, run_degdist, n_iter = 1000)
plot(lazega_degdist_lst$assess_plot)
We see similar results to before: once again the clustering is not well captured.
Our basic random graph models do not capture the higher levels of clustering observed in real-world networks.
The basic small world model is the Watts-Strogatz model. This creates a lattice network of size \(N\), connecting all neighbours within a particular path length \(k\), giving us a total edge count of \(Nk\). We then randomly move the edges to other nodes with probability \(p\).
For rewiring probability \(p\), the transitivity \(C(p)\) is approximately
\[ C(p) \approx \frac{3(K-2)}{4(K-1)} \, (1 - p)^3 \]
where \(K = 2k\) is the degree of each node in the initial lattice.
To fit this model to real data, we set \(k\) from the edge count, and then fit the appropriate \(p\) to match our observed transitivity.
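As a cross-check on a numerical fit, the transitivity approximation can be inverted in closed form. This is a sketch only: it assumes the large-network approximation is usable for a 36-node graph, it takes the lattice degree to be twice the `nei` argument of sample_smallworld(), and it reuses the Lazega counts reported elsewhere in this workshop. The helper names are hypothetical.

```r
# Approximate transitivity of a Watts-Strogatz graph with lattice degree K
ws_transitivity <- function(K, p) 3 * (K - 2) / (4 * (K - 1)) * (1 - p)^3

# Invert the approximation: rewiring probability that gives a target value
ws_rewire_prob <- function(K, target) {
  1 - (target / ws_transitivity(K, 0))^(1 / 3)
}

n_nodes <- 36
n_edges <- 115
nei     <- ceiling(n_edges / n_nodes)   # neighbours on each side of a node
K       <- 2 * nei                      # degree of each node in the lattice
target  <- 3 * 120 / 926                # observed Lazega transitivity

p_hat <- ws_rewire_prob(K, target)
p_hat
```

The result is about 0.15, which sits inside the (0.01, 0.2) interval searched by optimize() in the code that follows.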
lazega_node_count <- lazega_igraph %>% vcount()
lazega_edge_count <- lazega_igraph %>% ecount()
lazega_cluster <- lazega_igraph %>% transitivity()
lazega_k <- (lazega_edge_count / lazega_node_count) %>% ceiling()
calc_trans <- function(p_iter) {
trans <- rerun(10, sample_smallworld(1, lazega_node_count, lazega_k, p_iter) %>% transitivity) %>%
unlist() %>%
mean()
return(trans)
}
lazega_p <- optimize(function(x) abs(calc_trans(x) - lazega_cluster), c(0.01, 0.2))$minimum
run_ws <- function() sample_smallworld(1, size = lazega_node_count, nei = lazega_k, p = lazega_p) %>%
simplify()
lazega_ws_lst <- run_network_model_assessment(lazega_igraph, run_ws, n_iter = 1000)
plot(lazega_ws_lst$assess_plot)
With the preferential attachment model, we add new nodes and weight the probability of attachment to existing nodes by the degree of each node.
In the simple model, the probability that a new node attaches to an existing node \(v_i\) is
\[ P(v_i) = \frac{d_{v_i}}{\sum_{v_j \in V} d_{v_j}} \]
As the network grows, we have a ‘rich get richer’ effect as nodes on the network tend to get more and more nodes attached to them.
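The attachment rule can be sketched in a few lines of base R. This is a minimal illustration of the formula, not a reimplementation of sample_pa(): it starts from two connected nodes, adds one edge per new node, and the function names attach_prob() and grow_pa() are made up for this example.

```r
# Attachment probabilities proportional to current degree
attach_prob <- function(deg) deg / sum(deg)

# Grow a network one node at a time, each new node adding a single edge
grow_pa <- function(n_nodes) {
  stopifnot(n_nodes >= 3)
  deg   <- c(1, 1)                      # start from two connected nodes
  edges <- matrix(c(1, 2), ncol = 2)
  for (new_node in 3:n_nodes) {
    target <- sample(seq_along(deg), size = 1, prob = attach_prob(deg))
    deg[target] <- deg[target] + 1      # the chosen node gains an edge
    deg   <- c(deg, 1)                  # the new node starts with degree 1
    edges <- rbind(edges, c(new_node, target))
  }
  list(degree = deg, edges = edges)
}

attach_prob(c(3, 1, 1, 1))   # the hub receives half of the probability

set.seed(42)
toy_pa <- grow_pa(50)
table(toy_pa$degree)
```

Because a node's chance of being chosen grows with every edge it receives, early nodes tend to accumulate most of the connections.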
samplepa_igraph <- sample_pa(50, power = 1, m = 1, directed = FALSE)
samplepa_plot <- ggplot(samplepa_igraph
,aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges() +
geom_label(aes(label = vertex.names)) +
theme_blank()
samplepa_plot %>% plot()
For the degree distribution, we expect a small number of high-degree nodes, with the rest having low degree.
samplepa_degdist <- igraph::degree(samplepa_igraph)
summary(samplepa_degdist)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.00 1.00 1.00 1.96 2.00 9.00
ggplot() +
geom_histogram(aes(x = samplepa_degdist), bins = 20) +
xlab("Degree") +
ylab("Node Count")
Asymptotically, the degree distribution tends towards a power law of the form
\[ P(d) \sim d^{-3} \]
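For the idealised linear model with one edge per new node, the limiting degree distribution has a known closed form, \(P(d) = 4 / (d(d+1)(d+2))\), which behaves like \(4 d^{-3}\) for large \(d\). The sketch below evaluates it in base R. It describes the idealised model only: sample_pa() has further options (such as zero.appeal), so simulated frequencies need not match these values exactly. The function name is hypothetical.

```r
# Limiting degree distribution for linear preferential attachment, m = 1
pa_degree_prob <- function(d) 4 / (d * (d + 1) * (d + 2))

d     <- 1:10000
probs <- pa_degree_prob(d)

sum(probs)       # close to 1
sum(d * probs)   # mean degree, close to 2

# Ratio to the pure power law 4 * d^(-3) approaches 1 in the tail
tail_d <- c(1, 10, 100)
pa_degree_prob(tail_d) / (4 * tail_d^(-3))
```

Two thirds of the probability sits on degree 1 and the mean is 2, which is consistent with the degree summaries in this section (median 1, mean close to 2).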
Because of this tail effect, we will generate a bigger network and look at the degree distribution.
largepa_degreedist <- sample_pa(1000, power = 1, m = 1, directed = FALSE) %>%
igraph::degree()
summary(largepa_degreedist)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.000 1.000 1.000 1.998 2.000 31.000
ggplot() +
geom_histogram(aes(x = largepa_degreedist), bins = 50) +
xlab("Degree") +
ylab("Node Count")
We now look at some basic comparisons of the Preferential Attachment model to the Lazega network
lazega_node_count <- lazega_igraph %>% vcount()
run_pa <- function() sample_pa(lazega_node_count, power = 1, m = 1, directed = FALSE) %>%
simplify()
lazega_pa_lst <- run_network_model_assessment(lazega_igraph, run_pa, n_iter = 1000)
plot(lazega_pa_lst$assess_plot)
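As we see, the transitivity for the Preferential Attachment model tends to be very low. With m = 1 each new node adds a single edge, so the simulated graphs are trees and contain no triangles at all.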
The previous models we have used served a purpose, but are limited - these approaches are analogous to building models by fitting distributions.
We now move on to more sophisticated statistical models - Exponential Random Graph Models (ERGMs) in particular. Other approaches exist, such as stochastic block models and latent network models, but we will not have much time to discuss these.
Suppose we have a graph \(G = (V, E)\). Let \(Y\) be the random adjacency matrix for this graph and \(y\) a particular realisation of it. An ERGM takes the form
\[ P(Y = y) = \frac{1}{\kappa} \, \exp \left( \sum_H \theta_H g_H(y) \right) \]
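where each \(H\) is a configuration (a set of possible edges among a subset of the vertices), \(g_H(y)\) is 1 if configuration \(H\) occurs in \(y\) and 0 otherwise, \(\theta_H\) is the coefficient for that configuration, and \(\kappa\) is the normalising constant.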
In simpler terms, we fit the network based on counts of characteristics of the graph such as edges, triangles, stars and anything else we can think of.
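To make the formula concrete, the sketch below enumerates every undirected graph on four nodes and computes the ERGM probabilities by brute force, using edge and triangle counts as the statistics. The coefficients are arbitrary illustrative values, not fitted to any data, and the helper names are hypothetical.

```r
# All undirected graphs on 4 nodes: 6 dyads, so 2^6 = 64 graphs
n_nodes <- 4
dyads   <- t(combn(n_nodes, 2))   # 6 x 2 matrix of node pairs
graphs  <- as.matrix(expand.grid(rep(list(0:1), nrow(dyads))))

# Statistics g(y): number of edges and number of triangles
count_edges <- function(y) sum(y)
count_triangles <- function(y) {
  adj <- matrix(0, n_nodes, n_nodes)
  adj[dyads] <- y                 # fill the upper triangle
  adj <- adj + t(adj)             # symmetrise
  sum(diag(adj %*% adj %*% adj)) / 6
}

stats <- cbind(edges    = apply(graphs, 1, count_edges),
               triangle = apply(graphs, 1, count_triangles))

# Illustrative coefficients (not fitted to anything)
theta <- c(edges = -1, triangle = 0.5)

weights <- exp(stats %*% theta)   # unnormalised weight of each graph
kappa   <- sum(weights)           # normalising constant
probs   <- as.vector(weights / kappa)

sum(probs)
tapply(probs, stats[, "edges"], sum)   # probability of each edge count
```

Enumeration is only possible for toy examples. A 36-node network has 630 dyads and therefore \(2^{630}\) possible graphs, which is why \(\kappa\) cannot be computed directly and fitting relies on MCMC.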
We build our first model from edges - we assume only the presence of edges between nodes is relevant for the creation of the graph.
The function summary_formula() counts the various configurations in the network.
summary_formula(lazega_network ~ edges)
## edges
## 115
We now extend this model to see other configuration types including k-stars and so on.
summary_formula(lazega_network ~ edges + kstar(2) + kstar(3) + triangle)
## edges kstar2 kstar3 triangle
## 115 926 2681 120
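The \(k\)-star counts depend only on the degree sequence: a vertex of degree \(d\) is the centre of \(\binom{d}{k}\) distinct \(k\)-stars. A small base R sketch on a toy degree sequence, with a hypothetical helper name:

```r
# Number of k-stars implied by a degree sequence
kstar_count <- function(deg, k) sum(choose(deg, k))

# Toy example: one hub joined to three leaves
star_degree <- c(3, 1, 1, 1)
kstar_count(star_degree, 2)   # three 2-stars centred on the hub
kstar_count(star_degree, 3)   # one 3-star

# Toy example: a single triangle
triangle_degree <- c(2, 2, 2)
kstar_count(triangle_degree, 2)   # one 2-star centred on each vertex
```

Applied to the Lazega degree sequence, the same calculation should reproduce the kstar2 and kstar3 values above.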
In practice, adding \(k\)-stars directly as characteristics of networks results in poor fits, so we instead use alternative formulations that account for star effects of all orders simultaneously. Each of the ones we discuss has a parameter that controls how quickly the contribution of higher-order terms decays.
altkstar: \[ \text{AKS}_{\lambda}(y) = \sum_{k=2}^{N_v-1} (-1)^k \frac{S_k(y)}{\lambda^{k-2}} \]
where \(S_k(y)\) is the number of \(k\)-stars in the graph.
gwdegree: \[ \text{GWD}_{\gamma}(y) = \sum_{d=0}^{N_v-1} e^{-\gamma d} \, N_d(y) \]
where \(N_d(y)\) is the number of vertices of degree \(d\).
gwesp: \[ \text{AKT}_{\lambda}(y) = 3T_1 + \sum_{k=2}^{N_v-2} (-1)^{k+1} \frac{T_k(y)}{\lambda^{k-1}} \]
where \(T_k\) is the number of \(k\)-triangles, the set of \(k\) individual triangles sharing a common base.
In our models we use the AKT quantity to match the textbook, but any can be used.
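The first two statistics can be evaluated directly from a degree sequence. The sketch below follows the formulas as written in this section for a toy star graph. The ergm terms altkstar() and gwdegree() may use different parameterisations, so these values are not expected to reproduce summary_formula() output. All function names here are hypothetical.

```r
# Number of k-stars implied by a degree sequence
kstar_count <- function(deg, k) sum(choose(deg, k))

# Alternating k-star statistic: sum over k = 2, ..., N_v - 1
alt_kstar <- function(deg, lambda) {
  k <- 2:(length(deg) - 1)
  s_k <- sapply(k, kstar_count, deg = deg)
  sum((-1)^k * s_k / lambda^(k - 2))
}

# Geometrically weighted degree count: each vertex of degree d adds exp(-gamma * d)
gw_degree <- function(deg, gamma) sum(exp(-gamma * deg))

star_degree <- c(3, 1, 1, 1)        # one hub joined to three leaves
alt_kstar(star_degree, lambda = 2)  # S_2 - S_3 / 2
gw_degree(star_degree, gamma = 1)   # exp(-3) + 3 * exp(-1)
```

Larger values of \(\lambda\) or \(\gamma\) shrink the contribution of high-order stars and high-degree vertices more quickly.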
summary_formula(lazega_network ~ edges + gwesp(1, fixed = TRUE))
## edges gwesp.fixed.1
## 115.0000 213.1753
summary_formula(lazega_network ~ edges + triangles + gwdegree(1, fixed = TRUE))
## edges triangle gwdeg.fixed.1
## 115.0 120.0 79.2
So far we have kept our focus on purely topological properties of the networks, ignoring the attributes of the edges or vertices.
It is natural to expect the existence of an edge between two vertices to also depend on the attributes of those vertices. We can incorporate them into our ERGMs as additional terms.
Vertex attributes can influence a graph in two ways: a value on a vertex may influence the probability of an edge being connected (analogous to a ‘main’ effect in standard modelling), and the values on both vertices may influence the probability (analogous to ‘interactions’ or ‘second-order effects’).
These predictors are added to a formula via the nodemain and match terms. These are aliases for nodecov and nodematch, which is how they are labelled in the output.
summary_formula(
lazega_network ~ edges + triangles + gwdegree(1, fixed = TRUE) +
nodemain('Practice') + match('Office')
)
## edges triangle gwdeg.fixed.1 nodecov.Practice
## 115.0 120.0 79.2 359.0
## nodematch.Office
## 85.0
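Both statistics are simple sums over the edge list: the main effect adds the attribute values at the two endpoints of every edge, and the match term counts the edges whose endpoints share a value. A toy base R sketch, with made-up vertex and edge tables:

```r
# Toy network: 4 vertices with one numeric and one categorical attribute
vertex_tbl <- data.frame(
  id       = 1:4,
  practice = c(1, 2, 2, 1),
  office   = c("A", "A", "B", "B")
)
edge_tbl <- data.frame(from = c(1, 1, 2, 3), to = c(2, 3, 3, 4))

# Main effect: sum the attribute over both endpoints of every edge
nodecov_stat <- sum(vertex_tbl$practice[edge_tbl$from] +
                    vertex_tbl$practice[edge_tbl$to])

# Homophily: count the edges whose endpoints share the attribute value
nodematch_stat <- sum(vertex_tbl$office[edge_tbl$from] ==
                      vertex_tbl$office[edge_tbl$to])

c(nodecov = nodecov_stat, nodematch = nodematch_stat)
```

In the Lazega output above, the nodematch.Office value of 85 therefore says that 85 of the 115 edges join lawyers in the same office.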
To fit these models, we use an MCMC algorithm to calculate the MLE for the model. The ergm() function performs this optimisation.
We start fitting the model with some simple geometries as predictors.
lazega_01_ergm <- ergm(lazega_network ~ edges + triangles + gwesp(1)
,control = control.ergm(seed = 42)
)
run_ergm <- function() simulate(lazega_01_ergm) %>% intergraph::asIgraph()
lazega_01_lst <- run_network_model_assessment(lazega_igraph, run_ergm, n_iter = 1000)
plot(lazega_01_lst$assess_plot)
summary(lazega_01_ergm)
##
## ==========================
## Summary of model fit
## ==========================
##
## Formula: lazega_network ~ edges + triangles + gwesp(1)
##
## Iterations: 2 out of 20
##
## Monte Carlo MLE Results:
## Estimate Std. Error MCMC % z value Pr(>|z|)
## edges -3.80906 0.42812 0 -8.897 < 1e-04 ***
## triangle 0.02121 0.36329 0 0.058 0.953445
## gwesp 1.05610 0.31059 0 3.400 0.000673 ***
## gwesp.decay 0.72635 0.41393 0 1.755 0.079302 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Null Deviance: 873.4 on 630 degrees of freedom
## Residual Deviance: 521.3 on 626 degrees of freedom
##
## AIC: 529.3 BIC: 547.1 (Smaller is better.)
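The information criteria in this output can be recovered from the residual deviance. The sketch below assumes the usual definitions, with the number of dyads acting as the sample size for BIC. That assumption is consistent with the numbers printed above, but it is inferred from them rather than taken from the ergm documentation.

```r
resid_dev <- 521.3   # residual deviance reported above
n_params  <- 4       # edges, triangle, gwesp, gwesp.decay
n_dyads   <- 630     # degrees of freedom of the null deviance

aic <- resid_dev + 2 * n_params
bic <- resid_dev + log(n_dyads) * n_params

round(c(AIC = aic, BIC = bic), 1)
```

BIC penalises each extra parameter by log(630), roughly 6.4, rather than 2, so it is the stricter criterion when comparing this model with larger ones.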
We can simulate new graphs from this model using simulate(), so we produce one and plot it beside the original network.
plot_1 <- ggplot(lazega_igraph, aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges() +
geom_label(aes(label = vertex.names), size = 5) +
theme_blank()
plot_2 <- ggplot(simulate(lazega_01_ergm), aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges() +
geom_label(aes(label = vertex.names), size = 5) +
theme_blank()
plot_grid(plot_1, plot_2, ncol = 2)
We will have a look at the degree distribution of the original network and the simulated ERGM from it.
lazega_degdist <- lazega_igraph %>%
igraph::degree()
lazega_01_degdist <- simulate(lazega_01_ergm) %>%
intergraph::asIgraph() %>%
igraph::degree()
plot_1 <- ggplot() +
geom_histogram(aes(x = lazega_degdist), binwidth = 1) +
xlab("Degree") +
ylab("Count") +
ggtitle("Original Network")
plot_2 <- ggplot() +
geom_histogram(aes(x = lazega_01_degdist), binwidth = 1) +
xlab("Degree") +
ylab("Count") +
ggtitle("ERGM Simulation")
plot_grid(plot_1, plot_2, ncol = 2)
We now want to see the model running with vertex attributes as part of the model.
lazega_02_ergm <- ergm(lazega_network ~ edges + triangles +
gwesp(1) + nodefactor('Practice') +
nodemain('Seniority') + nodematch('Gender') + match('Office')
,control = control.ergm(seed = 42)
)
run_02_ergm <- function() simulate(lazega_02_ergm) %>% intergraph::asIgraph()
lazega_02_lst <- run_network_model_assessment(lazega_igraph, run_02_ergm, n_iter = 1000)
plot(lazega_02_lst$assess_plot)
summary(lazega_02_ergm)
##
## ==========================
## Summary of model fit
## ==========================
##
## Formula: lazega_network ~ edges + triangles + gwesp(1) + nodefactor("Practice") +
## nodemain("Seniority") + nodematch("Gender") + match("Office")
##
## Iterations: 2 out of 20
##
## Monte Carlo MLE Results:
## Estimate Std. Error MCMC % z value Pr(>|z|)
## edges -5.973787 0.626433 0 -9.536 < 1e-04 ***
## triangle -0.053230 0.523732 0 -0.102 0.919046
## gwesp 0.886073 0.292143 0 3.033 0.002421 **
## gwesp.decay 0.845804 0.623940 0 1.356 0.175231
## nodefactor.Practice.2 0.441055 0.131391 0 3.357 0.000788 ***
## nodecov.Seniority 0.022161 0.006483 0 3.419 0.000630 ***
## nodematch.Gender 0.658766 0.239477 0 2.751 0.005944 **
## nodematch.Office 1.063957 0.187858 0 5.664 < 1e-04 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Null Deviance: 873.4 on 630 degrees of freedom
## Residual Deviance: 471.5 on 622 degrees of freedom
##
## AIC: 487.5 BIC: 523.1 (Smaller is better.)
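ERGM coefficients are on the log-odds scale, conditional on the rest of the network, so exponentiating them gives odds ratios that are easier to read. A short sketch using the attribute estimates copied from the summary above:

```r
# Estimates copied from the summary above
coefs <- c(
  nodefactor.Practice.2 = 0.441055,
  nodecov.Seniority     = 0.022161,
  nodematch.Gender      = 0.658766,
  nodematch.Office      = 1.063957
)

# Conditional odds ratio for a tie, holding the rest of the network fixed
odds_ratio <- exp(coefs)
round(odds_ratio, 2)
```

For example, sharing an office multiplies the conditional odds of a tie by roughly 2.9, all else being equal.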
lazega_degdist <- lazega_igraph %>%
igraph::degree()
lazega_02_degdist <- simulate(lazega_02_ergm) %>%
intergraph::asIgraph() %>%
igraph::degree()
plot_1 <- ggplot() +
geom_histogram(aes(x = lazega_degdist), binwidth = 1) +
xlab("Degree") +
ylab("Count") +
ggtitle("Lazega")
plot_2 <- ggplot() +
geom_histogram(aes(x = lazega_01_degdist), binwidth = 1) +
xlab("Degree") +
ylab("Count") +
ggtitle("Model 01")
plot_3 <- ggplot() +
geom_histogram(aes(x = lazega_02_degdist), binwidth = 1) +
xlab("Degree") +
ylab("Count") +
ggtitle("Model 02")
plot_grid(plot_1, plot_2, plot_3, ncol = 3)
The faux.dixon.high dataset is a simulated dataset modelled from high-school friendships. The network is directed, but we fit an undirected version in this workshop.
data(faux.dixon.high)
dixon_igraph <- faux.dixon.high %>%
asIgraph() %>%
as.undirected() %>%
simplify()
dixon_network <- dixon_igraph %>%
asNetwork()
dixon_plot <- ggplot(ggnetwork(dixon_igraph, layout = 'fruchtermanreingold')
,aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges(alpha = 0.1) +
geom_nodes(aes(colour = race), size = 3) +
ggtitle('The Dixon High-school Network') +
theme_blank()
dixon_plot %>% plot()
Before we try the ERGMs, we will use a few basic random graph models first. Models such as these are not able to capture aspects like assortativity, but they may at least be able to reproduce the topology.
We start by fitting a \(G(n,m)\) model, and see how effective it is at capturing the structure of the network.
dixon_count_node <- gorder(dixon_igraph)
dixon_count_edge <- gsize (dixon_igraph)
run_gnm <- function() sample_gnm(n = dixon_count_node, dixon_count_edge)
dixon_gnm_lst <- run_network_model_assessment(dixon_igraph, run_gnm, n_iter = 1000)
plot(dixon_gnm_lst$assess_plot)
This model cannot account for the network structure at all.
We try fitting the degree distribution.
dixon_degdist <- dixon_igraph %>% igraph::degree()
run_degdist <- function() sample_degseq(dixon_degdist) %>% simplify()
dixon_degdist_lst <- run_network_model_assessment(dixon_igraph, run_degdist, n_iter = 1000)
plot(dixon_degdist_lst$assess_plot)
This model captures the connectedness of the Dixon network in terms of clusters and components, but the transitivity and average path length are much lower than observed.
Before moving on to ERGMs, we try a PA model.
dixon_node_count <- dixon_igraph %>% vcount()
run_pa <- function() sample_pa(dixon_node_count, power = 1, m = 1, directed = FALSE) %>%
simplify()
dixon_pa_lst <- run_network_model_assessment(dixon_igraph, run_pa, n_iter = 1000)
plot(dixon_pa_lst$assess_plot)
We now move on to fitting ERGMs with this data, and we start with just basic geometries such as edges and \(k\)-cores. Our first model fits on edges only.
dixon_model_01_ergm <- ergm(dixon_network ~ edges
,control = control.ergm(seed = 421)
)
run_01_ergm <- function() simulate(dixon_model_01_ergm) %>% intergraph::asIgraph()
dixon_01_lst <- run_network_model_assessment(dixon_igraph, run_01_ergm, n_iter = 1000)
plot(dixon_01_lst$assess_plot)
summary(dixon_model_01_ergm)
##
## ==========================
## Summary of model fit
## ==========================
##
## Formula: dixon_network ~ edges
##
## Iterations: 6 out of 20
##
## Monte Carlo MLE Results:
## Estimate Std. Error MCMC % z value Pr(>|z|)
## edges -3.4117 0.0325 0 -105 <1e-04 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Null Deviance: 42459 on 30628 degrees of freedom
## Residual Deviance: 8661 on 30627 degrees of freedom
##
## AIC: 8663 BIC: 8672 (Smaller is better.)
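An edges-only ERGM treats every dyad as an independent coin flip, so the coefficient is simply the log-odds of a tie. The sketch below converts the estimate above back to a probability and checks the null deviance, which corresponds to every dyad having probability one half.

```r
theta_edges <- -3.4117   # estimate reported above
n_dyads     <- 30628     # degrees of freedom of the null deviance

# Inverse logit gives the implied probability of a tie (the density)
tie_prob <- plogis(theta_edges)
tie_prob

# Expected number of edges under the fitted model
tie_prob * n_dyads

# Null deviance: -2 * log-likelihood when every dyad has probability 1/2
null_dev <- 2 * n_dyads * log(2)
null_dev
```

The fitted tie probability is about 0.032, so this model carries the same information as a random graph matched to the observed density.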
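Having tried edges as a predictor in the model, we now look to add additional geometries.
We start by adding triangles to the model, and checking the diagnostics. The triangle fit is left commented out below; the model we actually run uses a gwesp term with a fixed decay of 0.1 instead.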
# dixon_triangle_ergm <- ergm(dixon_network ~ edges + triangles
# ,control = control.ergm(seed = 421, MCMLE.maxit = 2)
# )
#
# mcmc.diagnostics(dixon_triangle_ergm)
# first_dixon_model_02_ergm <- quietly(ergm)(dixon_network ~ edges + gwesp(1.0, fixed=TRUE)
# ,control = control.ergm(seed = 422)
# )
dixon_model_02_ergm <- ergm(dixon_network ~ edges + gwesp(0.1, fixed=TRUE)
,control = control.ergm(seed = 422)
)
mcmc.diagnostics(dixon_model_02_ergm)
## Sample statistics summary:
##
## Iterations = 16384:4209664
## Thinning interval = 1024
## Number of chains = 1
## Sample size per chain = 4096
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## edges -10.45 66.52 1.039 7.658
## gwesp.fixed.0.1 -13.03 74.11 1.158 8.493
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## edges -155.0 -52.00 -8.00 35 115.0
## gwesp.fixed.0.1 -170.9 -60.47 -10.76 38 125.3
##
##
## Sample statistics cross-correlations:
## edges gwesp.fixed.0.1
## edges 1.0000000 0.9824547
## gwesp.fixed.0.1 0.9824547 1.0000000
##
## Sample statistics auto-correlation:
## Chain 1
## edges gwesp.fixed.0.1
## Lag 0 1.0000000 1.0000000
## Lag 1024 0.9596911 0.9543651
## Lag 2048 0.9232239 0.9156472
## Lag 3072 0.8901911 0.8804539
## Lag 4096 0.8590697 0.8494106
## Lag 5120 0.8310645 0.8201949
##
## Sample statistics burn-in diagnostic (Geweke):
## Chain 1
##
## Fraction in 1st window = 0.1
## Fraction in 2nd window = 0.5
##
## edges gwesp.fixed.0.1
## -1.531 -1.706
##
## Individual P-values (lower = worse):
## edges gwesp.fixed.0.1
## 0.12572099 0.08797367
## Joint P-value (lower = worse): 0.3191531 .
##
## MCMC diagnostics shown here are from the last round of simulation, prior to computation of final parameter estimates. Because the final estimates are refinements of those used for this simulation run, these diagnostics may understate model performance. To directly assess the performance of the final model on in-model statistics, please use the GOF command: gof(ergmFitObject, GOF=~model).
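The gap between the naive and time-series standard errors gives a quick measure of how much information the chain really contains. The sketch below uses the values printed above for the edges statistic, assuming the time-series standard error is the autocorrelation-adjusted standard error of the mean, which is how the output labels it.

```r
n_samples <- 4096    # sample size per chain

# Standard errors of the mean for the edges statistic, from the output above
naive_se <- 1.039
ts_se    <- 7.658

# Effective sample size implied by the inflation of the standard error
ess <- n_samples * (naive_se / ts_se)^2
ess

# Two-sided p-value for the Geweke z-score of the edges statistic
geweke_p <- 2 * pnorm(-abs(-1.531))
geweke_p
```

With lag-1024 autocorrelations still above 0.95, the 4096 draws are worth only about 75 independent samples, so a longer chain or a larger thinning interval may be worth trying.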
run_02_ergm <- function() simulate(dixon_model_02_ergm) %>% intergraph::asIgraph()
dixon_02_lst <- run_network_model_assessment(dixon_igraph, run_02_ergm, n_iter = 1000)
plot(dixon_02_lst$assess_plot)
summary(dixon_model_02_ergm)
##
## ==========================
## Summary of model fit
## ==========================
##
## Formula: dixon_network ~ edges + gwesp(0.1, fixed = TRUE)
##
## Iterations: 7 out of 20
##
## Monte Carlo MLE Results:
## Estimate Std. Error MCMC % z value Pr(>|z|)
## edges -4.91035 0.08064 0 -60.89 <1e-04 ***
## gwesp.fixed.0.1 1.49513 0.07238 0 20.66 <1e-04 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Null Deviance: 42459 on 30628 degrees of freedom
## Residual Deviance: 7897 on 30626 degrees of freedom
##
## AIC: 7901 BIC: 7918 (Smaller is better.)
model_03_formula <- formula(
dixon_network ~ edges + gwesp(0.1, fixed=TRUE) + absdiff('grade') +
nodefactor('race') + nodefactor('grade') + nodefactor('sex')
)
dixon_model_03_ergm <- ergm(model_03_formula, control = control.ergm(seed = 423))
mcmc.diagnostics(dixon_model_03_ergm)
## Sample statistics summary:
##
## Iterations = 16384:4209664
## Thinning interval = 1024
## Number of chains = 1
## Sample size per chain = 4096
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## edges 7.0862 48.22 0.7535 4.9875
## gwesp.fixed.0.1 11.5981 56.94 0.8897 5.8660
## absdiff.grade -2.7803 40.39 0.6310 2.3694
## nodefactor.race.H -0.2854 12.08 0.1888 0.8892
## nodefactor.race.O 4.5413 12.58 0.1966 0.8814
## nodefactor.race.W 7.4199 60.43 0.9442 5.4554
## nodefactor.grade.8 4.4802 39.36 0.6150 3.6011
## nodefactor.grade.9 7.9612 39.41 0.6158 4.0165
## nodefactor.grade.10 8.0789 36.76 0.5744 3.1281
## nodefactor.grade.11 -10.6321 23.66 0.3697 1.9012
## nodefactor.grade.12 3.1035 24.34 0.3803 2.4566
## nodefactor.sex.2 -0.9460 55.79 0.8717 4.6188
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## edges -95.00 -24.00 8.00 37.00 103.0
## gwesp.fixed.0.1 -109.33 -24.55 12.44 48.11 122.1
## absdiff.grade -82.62 -29.00 -3.00 24.00 77.0
## nodefactor.race.H -26.00 -8.00 0.00 8.00 22.0
## nodefactor.race.O -22.00 -4.00 5.00 14.00 27.0
## nodefactor.race.W -111.62 -33.00 7.00 47.00 129.0
## nodefactor.grade.8 -72.62 -22.00 5.00 32.00 78.0
## nodefactor.grade.9 -73.00 -18.00 9.00 33.00 86.0
## nodefactor.grade.10 -59.00 -19.00 7.00 33.00 82.0
## nodefactor.grade.11 -53.62 -28.00 -10.00 6.00 36.0
## nodefactor.grade.12 -46.00 -13.00 3.00 19.00 51.0
## nodefactor.sex.2 -109.00 -38.00 -1.00 36.00 110.0
##
##
## Sample statistics cross-correlations:
## edges gwesp.fixed.0.1 absdiff.grade nodefactor.race.H
## edges 1.0000000 0.9646805 0.7214587 0.3821703
## gwesp.fixed.0.1 0.9646805 1.0000000 0.6513644 0.3725773
## absdiff.grade 0.7214587 0.6513644 1.0000000 0.3069750
## nodefactor.race.H 0.3821703 0.3725773 0.3069750 1.0000000
## nodefactor.race.O 0.3007446 0.2885035 0.2469147 0.1112186
## nodefactor.race.W 0.9000292 0.8725112 0.6631400 0.2353615
## nodefactor.grade.8 0.5900632 0.5865098 0.3658694 0.3340160
## nodefactor.grade.9 0.6257223 0.6287100 0.4449602 0.2033900
## nodefactor.grade.10 0.5749738 0.5476587 0.4475155 0.2302597
## nodefactor.grade.11 0.4214423 0.3781900 0.3628627 0.0588860
## nodefactor.grade.12 0.4407565 0.4158121 0.2855260 0.1569877
## nodefactor.sex.2 0.8770870 0.8435214 0.6354954 0.3269790
## nodefactor.race.O nodefactor.race.W nodefactor.grade.8
## edges 0.30074455 0.9000292 0.59006321
## gwesp.fixed.0.1 0.28850351 0.8725112 0.58650983
## absdiff.grade 0.24691474 0.6631400 0.36586944
## nodefactor.race.H 0.11121864 0.2353615 0.33401603
## nodefactor.race.O 1.00000000 0.1639279 0.40406293
## nodefactor.race.W 0.16392789 1.0000000 0.47393246
## nodefactor.grade.8 0.40406293 0.4739325 1.00000000
## nodefactor.grade.9 0.08262673 0.6027197 0.18254215
## nodefactor.grade.10 0.11710319 0.4958724 0.12077672
## nodefactor.grade.11 -0.03032231 0.4336636 0.01867281
## nodefactor.grade.12 0.02592101 0.3969571 0.05641579
## nodefactor.sex.2 0.27487062 0.7373593 0.51534420
## nodefactor.grade.9 nodefactor.grade.10 nodefactor.grade.11
## edges 0.62572230 0.5749738 0.42144228
## gwesp.fixed.0.1 0.62871004 0.5476587 0.37818996
## absdiff.grade 0.44496017 0.4475155 0.36286268
## nodefactor.race.H 0.20338996 0.2302597 0.05888600
## nodefactor.race.O 0.08262673 0.1171032 -0.03032231
## nodefactor.race.W 0.60271966 0.4958724 0.43366357
## nodefactor.grade.8 0.18254215 0.1207767 0.01867281
## nodefactor.grade.9 1.00000000 0.1207140 0.12889280
## nodefactor.grade.10 0.12071397 1.0000000 0.17821922
## nodefactor.grade.11 0.12889280 0.1782192 1.00000000
## nodefactor.grade.12 0.16230317 0.1196917 0.17130108
## nodefactor.sex.2 0.50959535 0.5137833 0.40955798
## nodefactor.grade.12 nodefactor.sex.2
## edges 0.44075646 0.8770870
## gwesp.fixed.0.1 0.41581208 0.8435214
## absdiff.grade 0.28552603 0.6354954
## nodefactor.race.H 0.15698774 0.3269790
## nodefactor.race.O 0.02592101 0.2748706
## nodefactor.race.W 0.39695710 0.7373593
## nodefactor.grade.8 0.05641579 0.5153442
## nodefactor.grade.9 0.16230317 0.5095954
## nodefactor.grade.10 0.11969165 0.5137833
## nodefactor.grade.11 0.17130108 0.4095580
## nodefactor.grade.12 1.00000000 0.4077095
## nodefactor.sex.2 0.40770949 1.0000000
##
## Sample statistics auto-correlation:
## Chain 1
## edges gwesp.fixed.0.1 absdiff.grade nodefactor.race.H
## Lag 0 1.0000000 1.0000000 1.0000000 1.0000000
## Lag 1024 0.9271897 0.9293185 0.8271087 0.8907473
## Lag 2048 0.8665905 0.8728258 0.6995408 0.8078518
## Lag 3072 0.8206042 0.8256472 0.6028429 0.7426080
## Lag 4096 0.7791750 0.7852464 0.5259691 0.6813141
## Lag 5120 0.7411313 0.7459924 0.4668279 0.6241464
## nodefactor.race.O nodefactor.race.W nodefactor.grade.8
## Lag 0 1.0000000 1.0000000 1.0000000
## Lag 1024 0.8863025 0.9212878 0.9276980
## Lag 2048 0.7986746 0.8554013 0.8682093
## Lag 3072 0.7260903 0.8044192 0.8190175
## Lag 4096 0.6633240 0.7617184 0.7776533
## Lag 5120 0.6074744 0.7222408 0.7396965
## nodefactor.grade.9 nodefactor.grade.10 nodefactor.grade.11
## Lag 0 1.0000000 1.0000000 1.0000000
## Lag 1024 0.9320181 0.9242792 0.9097557
## Lag 2048 0.8755592 0.8611241 0.8366270
## Lag 3072 0.8274193 0.8062944 0.7749164
## Lag 4096 0.7862351 0.7574400 0.7210968
## Lag 5120 0.7490401 0.7142373 0.6692680
## nodefactor.grade.12 nodefactor.sex.2
## Lag 0 1.0000000 1.0000000
## Lag 1024 0.9242333 0.9177757
## Lag 2048 0.8683036 0.8477211
## Lag 3072 0.8191517 0.7919432
## Lag 4096 0.7742092 0.7418181
## Lag 5120 0.7397056 0.6954462
##
## Sample statistics burn-in diagnostic (Geweke):
## Chain 1
##
## Fraction in 1st window = 0.1
## Fraction in 2nd window = 0.5
##
## edges gwesp.fixed.0.1 absdiff.grade nodefactor.race.H
## -0.2681 -0.2379 -0.4989 2.0426
## nodefactor.race.O nodefactor.race.W nodefactor.grade.8 nodefactor.grade.9
## 0.2026 -0.1462 -0.1085 0.2540
## nodefactor.grade.10 nodefactor.grade.11 nodefactor.grade.12 nodefactor.sex.2
## -0.4298 -1.3262 0.6351 -0.7523
##
## Individual P-values (lower = worse):
## edges gwesp.fixed.0.1 absdiff.grade nodefactor.race.H
## 0.78862578 0.81198486 0.61787180 0.04108994
## nodefactor.race.O nodefactor.race.W nodefactor.grade.8 nodefactor.grade.9
## 0.83948351 0.88375690 0.91363621 0.79949953
## nodefactor.grade.10 nodefactor.grade.11 nodefactor.grade.12 nodefactor.sex.2
## 0.66731045 0.18478733 0.52536428 0.45188735
## Joint P-value (lower = worse): 0.6313901 .
##
## MCMC diagnostics shown here are from the last round of simulation, prior to computation of final parameter estimates. Because the final estimates are refinements of those used for this simulation run, these diagnostics may understate model performance. To directly assess the performance of the final model on in-model statistics, please use the GOF command: gof(ergmFitObject, GOF=~model).
run_03_ergm <- function() simulate(dixon_model_03_ergm) %>% intergraph::asIgraph()
dixon_03_lst <- run_network_model_assessment(dixon_igraph, run_03_ergm, n_iter = 1000)
plot(dixon_03_lst$assess_plot)
summary(dixon_model_03_ergm)
##
## ==========================
## Summary of model fit
## ==========================
##
## Formula: dixon_network ~ edges + gwesp(0.1, fixed = TRUE) + absdiff("grade") +
## nodefactor("race") + nodefactor("grade") + nodefactor("sex")
##
## Iterations: 6 out of 20
##
## Monte Carlo MLE Results:
## Estimate Std. Error MCMC % z value Pr(>|z|)
## edges -4.317119 0.167607 0 -25.757 < 1e-04 ***
## gwesp.fixed.0.1 1.192164 0.070367 0 16.942 < 1e-04 ***
## absdiff.grade -0.800613 0.037611 0 -21.286 < 1e-04 ***
## nodefactor.race.H 0.008941 0.095384 0 0.094 0.925316
## nodefactor.race.O 0.215778 0.094059 0 2.294 0.021787 *
## nodefactor.race.W 0.144073 0.043322 0 3.326 0.000882 ***
## nodefactor.grade.8 0.255685 0.074757 0 3.420 0.000626 ***
## nodefactor.grade.9 0.259774 0.072754 0 3.571 0.000356 ***
## nodefactor.grade.10 0.191106 0.072947 0 2.620 0.008798 **
## nodefactor.grade.11 0.102944 0.078288 0 1.315 0.188528
## nodefactor.grade.12 0.330892 0.079163 0 4.180 < 1e-04 ***
## nodefactor.sex.2 0.054042 0.039195 0 1.379 0.167958
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Null Deviance: 42459 on 30628 degrees of freedom
## Residual Deviance: 7064 on 30616 degrees of freedom
##
## AIC: 7088 BIC: 7188 (Smaller is better.)
model_04_formula <- formula(
dixon_network ~ edges + gwesp(0.1, fixed=TRUE) + absdiff('grade') +
nodefactor('race') + nodefactor('grade') + nodefactor('sex') +
nodematch('grade', diff=TRUE) + nodematch('sex', diff=FALSE) +
nodematch('race', diff=TRUE) + degree(0:3)
)
dixon_model_04_ergm <- ergm(model_04_formula, control = control.ergm(seed = 423))
mcmc.diagnostics(dixon_model_04_ergm)
## Sample statistics summary:
##
## Iterations = 16384:1063936
## Thinning interval = 1024
## Number of chains = 1
## Sample size per chain = 1024
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## edges -16.9785 46.283 1.44633 8.3110
## gwesp.fixed.0.1 -22.9564 51.798 1.61869 9.6751
## absdiff.grade -20.9639 53.171 1.66160 6.9264
## nodefactor.race.H -4.7002 12.158 0.37995 1.7446
## nodefactor.race.O 11.1797 10.960 0.34251 1.5195
## nodefactor.race.W 9.7998 66.924 2.09137 10.6121
## nodefactor.grade.8 15.0586 33.041 1.03254 5.2796
## nodefactor.grade.9 -12.5127 31.419 0.98185 4.8299
## nodefactor.grade.10 -34.4150 41.597 1.29992 9.5134
## nodefactor.grade.11 -3.6621 32.956 1.02988 7.3281
## nodefactor.grade.12 -32.2910 29.749 0.92966 7.2073
## nodefactor.sex.2 -13.8506 62.106 1.94082 13.8091
## nodematch.grade.7 13.1143 7.374 0.23045 1.6745
## nodematch.grade.8 7.0850 14.079 0.43996 2.1848
## nodematch.grade.9 -2.2188 9.980 0.31187 1.4209
## nodematch.grade.10 -12.2158 17.933 0.56041 4.1801
## nodematch.grade.11 2.1816 9.108 0.28463 1.8361
## nodematch.grade.12 -10.9248 10.702 0.33444 2.7168
## nodematch.sex -10.3330 30.475 0.95235 5.1495
## nodematch.race.B -23.8428 25.539 0.79808 5.1906
## nodematch.race.O 0.8740 1.692 0.05288 0.2044
## nodematch.race.W 3.6025 30.573 0.95542 4.7515
## degree0 -0.7891 3.372 0.10537 0.2927
## degree1 -0.4385 4.839 0.15122 0.5737
## degree2 0.3125 3.905 0.12203 0.3522
## degree3 0.9922 3.409 0.10653 0.1813
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## edges -103.00 -48.25 -16.00 16.00 69.00
## gwesp.fixed.0.1 -123.44 -56.36 -23.61 12.07 78.07
## absdiff.grade -122.42 -59.00 -22.50 16.25 83.42
## nodefactor.race.H -29.42 -13.00 -4.00 3.00 19.00
## nodefactor.race.O -11.00 4.00 11.00 19.00 32.00
## nodefactor.race.W -103.00 -40.00 3.00 55.25 151.27
## nodefactor.grade.8 -45.42 -8.00 13.00 36.00 86.00
## nodefactor.grade.9 -81.85 -32.00 -12.00 9.00 46.42
## nodefactor.grade.10 -114.00 -65.25 -34.00 -8.00 50.42
## nodefactor.grade.11 -70.42 -27.00 -1.50 21.00 52.00
## nodefactor.grade.12 -81.00 -55.00 -36.00 -11.00 29.42
## nodefactor.sex.2 -149.12 -51.00 -11.00 30.00 91.42
## nodematch.grade.7 -2.00 8.00 14.00 19.00 25.00
## nodematch.grade.8 -18.00 -3.00 7.00 16.00 35.00
## nodematch.grade.9 -23.00 -9.00 -2.00 4.00 16.00
## nodematch.grade.10 -46.00 -25.00 -13.00 0.00 22.00
## nodematch.grade.11 -16.00 -5.00 3.00 10.00 16.00
## nodematch.grade.12 -30.00 -19.00 -12.00 -4.00 12.00
## nodematch.sex -70.42 -31.00 -9.00 10.25 51.00
## nodematch.race.B -76.42 -39.00 -22.00 -9.00 27.42
## nodematch.race.O -3.00 0.00 1.00 2.00 4.00
## nodematch.race.W -50.42 -21.00 1.00 25.00 66.42
## degree0 -8.00 -3.00 -1.00 2.00 5.00
## degree1 -10.00 -3.25 0.00 3.00 9.00
## degree2 -8.00 -2.00 1.00 3.00 7.00
## degree3 -6.00 -1.00 1.00 3.00 7.00
##
##
## Sample statistics cross-correlations:
## edges gwesp.fixed.0.1 absdiff.grade nodefactor.race.H
## edges 1.00000000 0.95384485 0.68566559 0.36383713
## gwesp.fixed.0.1 0.95384485 1.00000000 0.60858962 0.36736231
## absdiff.grade 0.68566559 0.60858962 1.00000000 0.12709729
## nodefactor.race.H 0.36383713 0.36736231 0.12709729 1.00000000
## nodefactor.race.O 0.08191919 0.05131889 0.09025739 -0.09596391
## nodefactor.race.W 0.77308435 0.72917700 0.59976306 0.21205102
## nodefactor.grade.8 0.45958873 0.49668384 0.11131902 0.28120662
## nodefactor.grade.9 0.54930873 0.55067272 0.51398179 0.20349543
## nodefactor.grade.10 0.56157953 0.55601571 0.08349276 0.27838261
## nodefactor.grade.11 0.67207452 0.59917667 0.64138097 0.21691363
## nodefactor.grade.12 0.38795441 0.34972513 0.49936470 0.15250336
## nodefactor.sex.2 0.85305786 0.81613322 0.52866341 0.25083533
## nodematch.grade.7 0.09967086 0.03563099 0.09464906 -0.23569177
## nodematch.grade.8 0.35553401 0.40536955 -0.07409172 0.26832399
## nodematch.grade.9 0.26860535 0.29375973 0.04478334 0.19983663
## nodematch.grade.10 0.36193638 0.38333460 -0.16478313 0.22369831
## nodematch.grade.11 0.65080312 0.58763446 0.49871850 0.16873518
## nodematch.grade.12 0.19452064 0.17443671 0.15861892 0.16342517
## nodematch.sex 0.86295775 0.83755996 0.62437614 0.41600049
## nodematch.race.B 0.55078401 0.54313858 0.35236715 0.04262810
## nodematch.race.O -0.11328183 -0.12991111 -0.06560869 -0.03598567
## nodematch.race.W 0.67991463 0.64967437 0.56162331 0.10919850
## degree0 -0.33859986 -0.21640637 -0.34181474 -0.02019061
## degree1 -0.46998072 -0.38999655 -0.41615249 -0.08753016
## degree2 -0.37600226 -0.38242548 -0.19057173 -0.17487794
## degree3 -0.16653321 -0.18910606 -0.08048920 -0.09473136
## nodefactor.race.O nodefactor.race.W nodefactor.grade.8
## edges 0.081919190 0.77308435 0.459588731
## gwesp.fixed.0.1 0.051318892 0.72917700 0.496683843
## absdiff.grade 0.090257395 0.59976306 0.111319021
## nodefactor.race.H -0.095963908 0.21205102 0.281206616
## nodefactor.race.O 1.000000000 0.01324250 0.166316453
## nodefactor.race.W 0.013242496 1.00000000 0.281487951
## nodefactor.grade.8 0.166316453 0.28148795 1.000000000
## nodefactor.grade.9 0.099610698 0.32533365 0.074832708
## nodefactor.grade.10 -0.146007469 0.38472400 0.141278738
## nodefactor.grade.11 -0.011447884 0.68480887 0.045802525
## nodefactor.grade.12 -0.043109065 0.38755931 -0.075280208
## nodefactor.sex.2 0.001046168 0.66478398 0.383354066
## nodematch.grade.7 0.291205222 0.06668934 0.076571024
## nodematch.grade.8 0.115004669 0.21497686 0.914192205
## nodematch.grade.9 0.050138018 0.04469168 0.007791087
## nodematch.grade.10 -0.204658594 0.20749389 0.141482654
## nodematch.grade.11 0.018365930 0.64056712 0.154920725
## nodematch.grade.12 -0.035283150 0.18947423 -0.072978728
## nodematch.sex -0.003505196 0.63493614 0.449889469
## nodematch.race.B -0.101509060 -0.02906049 0.253056761
## nodematch.race.O 0.494857761 -0.20828536 -0.093937232
## nodematch.race.W -0.032677515 0.97233837 0.220701347
## degree0 -0.088129762 -0.26651796 -0.028425496
## degree1 -0.182152303 -0.34929522 -0.124043586
## degree2 0.031026831 -0.34777572 -0.127586956
## degree3 0.066726985 -0.12918866 -0.076064600
## nodefactor.grade.9 nodefactor.grade.10 nodefactor.grade.11
## edges 0.54930873 0.56157953 0.672074521
## gwesp.fixed.0.1 0.55067272 0.55601571 0.599176674
## absdiff.grade 0.51398179 0.08349276 0.641380969
## nodefactor.race.H 0.20349543 0.27838261 0.216913628
## nodefactor.race.O 0.09961070 -0.14600747 -0.011447884
## nodefactor.race.W 0.32533365 0.38472400 0.684808872
## nodefactor.grade.8 0.07483271 0.14127874 0.045802525
## nodefactor.grade.9 1.00000000 0.14638176 0.258703890
## nodefactor.grade.10 0.14638176 1.00000000 0.224772842
## nodefactor.grade.11 0.25870389 0.22477284 1.000000000
## nodefactor.grade.12 0.04324650 -0.09709743 0.413385111
## nodefactor.sex.2 0.36517214 0.63166576 0.578640597
## nodematch.grade.7 0.01084266 -0.13875495 -0.134835128
## nodematch.grade.8 -0.04898413 0.15442000 0.003563855
## nodematch.grade.9 0.74115001 0.15938246 -0.012781211
## nodematch.grade.10 -0.03349089 0.91431323 0.024175872
## nodematch.grade.11 0.15360009 0.23863531 0.914102915
## nodematch.grade.12 -0.09983698 -0.12889268 0.185509877
## nodematch.sex 0.51488285 0.44845640 0.590463316
## nodematch.race.B 0.41340920 0.38310742 0.202667748
## nodematch.race.O -0.07099704 -0.10290744 -0.175449132
## nodematch.race.W 0.26979210 0.33037094 0.644415228
## degree0 -0.07206649 -0.03482880 -0.300044141
## degree1 -0.19496071 -0.10408753 -0.383065520
## degree2 -0.09444225 -0.25944448 -0.340475865
## degree3 -0.07683843 -0.18488740 -0.167611759
## nodefactor.grade.12 nodefactor.sex.2 nodematch.grade.7
## edges 0.38795441 0.853057865 0.09967086
## gwesp.fixed.0.1 0.34972513 0.816133217 0.03563099
## absdiff.grade 0.49936470 0.528663407 0.09464906
## nodefactor.race.H 0.15250336 0.250835331 -0.23569177
## nodefactor.race.O -0.04310907 0.001046168 0.29120522
## nodefactor.race.W 0.38755931 0.664783979 0.06668934
## nodefactor.grade.8 -0.07528021 0.383354066 0.07657102
## nodefactor.grade.9 0.04324650 0.365172136 0.01084266
## nodefactor.grade.10 -0.09709743 0.631665757 -0.13875495
## nodefactor.grade.11 0.41338511 0.578640597 -0.13483513
## nodefactor.grade.12 1.00000000 0.199757048 -0.11804272
## nodefactor.sex.2 0.19975705 1.000000000 0.13275997
## nodematch.grade.7 -0.11804272 0.132759972 1.00000000
## nodematch.grade.8 -0.09835173 0.293807744 0.03786883
## nodematch.grade.9 -0.12444216 0.168156062 -0.03817967
## nodematch.grade.10 -0.18622286 0.513753873 -0.13273128
## nodematch.grade.11 0.35567464 0.596286000 -0.05286312
## nodematch.grade.12 0.88188375 0.008183632 -0.15546702
## nodematch.sex 0.32901245 0.724951223 0.01779858
## nodematch.race.B 0.15247379 0.490630149 0.07086830
## nodematch.race.O -0.04506448 -0.118526165 0.29023893
## nodematch.race.W 0.35448529 0.604227917 0.05702076
## degree0 -0.22026105 -0.282507460 -0.41325031
## degree1 -0.27890898 -0.397394777 -0.34499960
## degree2 -0.25281001 -0.330855347 0.07920872
## degree3 -0.06723533 -0.155495413 0.21530476
## nodematch.grade.8 nodematch.grade.9 nodematch.grade.10
## edges 0.355534009 0.268605346 0.361936378
## gwesp.fixed.0.1 0.405369549 0.293759732 0.383334597
## absdiff.grade -0.074091720 0.044783339 -0.164783125
## nodefactor.race.H 0.268323986 0.199836633 0.223698307
## nodefactor.race.O 0.115004669 0.050138018 -0.204658594
## nodefactor.race.W 0.214976860 0.044691679 0.207493890
## nodefactor.grade.8 0.914192205 0.007791087 0.141482654
## nodefactor.grade.9 -0.048984127 0.741150015 -0.033490888
## nodefactor.grade.10 0.154420003 0.159382463 0.914313231
## nodefactor.grade.11 0.003563855 -0.012781211 0.024175872
## nodefactor.grade.12 -0.098351730 -0.124442161 -0.186222862
## nodefactor.sex.2 0.293807744 0.168156062 0.513753873
## nodematch.grade.7 0.037868830 -0.038179666 -0.132731280
## nodematch.grade.8 1.000000000 0.007458416 0.196533943
## nodematch.grade.9 0.007458416 1.000000000 0.120609909
## nodematch.grade.10 0.196533943 0.120609909 1.000000000
## nodematch.grade.11 0.125355867 -0.058861703 0.086252705
## nodematch.grade.12 -0.063595479 -0.127769855 -0.142899722
## nodematch.sex 0.333268070 0.281987184 0.271731505
## nodematch.race.B 0.168102979 0.278166922 0.292535652
## nodematch.race.O -0.093602213 -0.032894202 -0.084464013
## nodematch.race.W 0.165485171 -0.010835302 0.167790025
## degree0 0.033722280 0.096802021 0.034185364
## degree1 -0.072669607 -0.053076896 -0.009494584
## degree2 -0.156911917 -0.053452254 -0.201771764
## degree3 -0.080479129 -0.064442515 -0.160009308
## nodematch.grade.11 nodematch.grade.12 nodematch.sex
## edges 0.65080312 0.194520636 0.862957751
## gwesp.fixed.0.1 0.58763446 0.174436709 0.837559955
## absdiff.grade 0.49871850 0.158618923 0.624376136
## nodefactor.race.H 0.16873518 0.163425167 0.416000488
## nodefactor.race.O 0.01836593 -0.035283150 -0.003505196
## nodefactor.race.W 0.64056712 0.189474225 0.634936141
## nodefactor.grade.8 0.15492073 -0.072978728 0.449889469
## nodefactor.grade.9 0.15360009 -0.099836983 0.514882855
## nodefactor.grade.10 0.23863531 -0.128892685 0.448456404
## nodefactor.grade.11 0.91410291 0.185509877 0.590463316
## nodefactor.grade.12 0.35567464 0.881883751 0.329012450
## nodefactor.sex.2 0.59628600 0.008183632 0.724951223
## nodematch.grade.7 -0.05286312 -0.155467024 0.017798575
## nodematch.grade.8 0.12535587 -0.063595479 0.333268070
## nodematch.grade.9 -0.05886170 -0.127769855 0.281987184
## nodematch.grade.10 0.08625270 -0.142899722 0.271731505
## nodematch.grade.11 1.00000000 0.174082853 0.565190243
## nodematch.grade.12 0.17408285 1.000000000 0.130953197
## nodematch.sex 0.56519024 0.130953197 1.000000000
## nodematch.race.B 0.23198721 0.062545537 0.515596319
## nodematch.race.O -0.15563053 -0.006116195 -0.168281991
## nodematch.race.W 0.60276300 0.158215242 0.554906968
## degree0 -0.32970491 -0.148373782 -0.262910863
## degree1 -0.36783793 -0.148304675 -0.395968593
## degree2 -0.35759402 -0.183566490 -0.315805474
## degree3 -0.14241756 -0.039558842 -0.156220879
## nodematch.race.B nodematch.race.O nodematch.race.W
## edges 0.55078401 -0.113281833 0.67991463
## gwesp.fixed.0.1 0.54313858 -0.129911108 0.64967437
## absdiff.grade 0.35236715 -0.065608690 0.56162331
## nodefactor.race.H 0.04262810 -0.035985670 0.10919850
## nodefactor.race.O -0.10150906 0.494857761 -0.03267752
## nodefactor.race.W -0.02906049 -0.208285364 0.97233837
## nodefactor.grade.8 0.25305676 -0.093937232 0.22070135
## nodefactor.grade.9 0.41340920 -0.070997037 0.26979210
## nodefactor.grade.10 0.38310742 -0.102907441 0.33037094
## nodefactor.grade.11 0.20266775 -0.175449132 0.64441523
## nodefactor.grade.12 0.15247379 -0.045064477 0.35448529
## nodefactor.sex.2 0.49063015 -0.118526165 0.60422792
## nodematch.grade.7 0.07086830 0.290238933 0.05702076
## nodematch.grade.8 0.16810298 -0.093602213 0.16548517
## nodematch.grade.9 0.27816692 -0.032894202 -0.01083530
## nodematch.grade.10 0.29253565 -0.084464013 0.16779003
## nodematch.grade.11 0.23198721 -0.155630526 0.60276300
## nodematch.grade.12 0.06254554 -0.006116195 0.15821524
## nodematch.sex 0.51559632 -0.168281991 0.55490697
## nodematch.race.B 1.00000000 -0.058877800 -0.08658555
## nodematch.race.O -0.05887780 1.000000000 -0.22275508
## nodematch.race.W -0.08658555 -0.222755084 1.00000000
## degree0 -0.21721766 -0.047767953 -0.22818706
## degree1 -0.28182406 -0.062744523 -0.29050360
## degree2 -0.14643129 0.029782926 -0.32963914
## degree3 -0.09141757 0.084905655 -0.12094633
## degree0 degree1 degree2 degree3
## edges -0.33859986 -0.469980716 -0.37600226 -0.16653321
## gwesp.fixed.0.1 -0.21640637 -0.389996552 -0.38242548 -0.18910606
## absdiff.grade -0.34181474 -0.416152494 -0.19057173 -0.08048920
## nodefactor.race.H -0.02019061 -0.087530158 -0.17487794 -0.09473136
## nodefactor.race.O -0.08812976 -0.182152303 0.03102683 0.06672698
## nodefactor.race.W -0.26651796 -0.349295218 -0.34777572 -0.12918866
## nodefactor.grade.8 -0.02842550 -0.124043586 -0.12758696 -0.07606460
## nodefactor.grade.9 -0.07206649 -0.194960706 -0.09444225 -0.07683843
## nodefactor.grade.10 -0.03482880 -0.104087530 -0.25944448 -0.18488740
## nodefactor.grade.11 -0.30004414 -0.383065520 -0.34047587 -0.16761176
## nodefactor.grade.12 -0.22026105 -0.278908983 -0.25281001 -0.06723533
## nodefactor.sex.2 -0.28250746 -0.397394777 -0.33085535 -0.15549541
## nodematch.grade.7 -0.41325031 -0.344999599 0.07920872 0.21530476
## nodematch.grade.8 0.03372228 -0.072669607 -0.15691192 -0.08047913
## nodematch.grade.9 0.09680202 -0.053076896 -0.05345225 -0.06444252
## nodematch.grade.10 0.03418536 -0.009494584 -0.20177176 -0.16000931
## nodematch.grade.11 -0.32970491 -0.367837930 -0.35759402 -0.14241756
## nodematch.grade.12 -0.14837378 -0.148304675 -0.18356649 -0.03955884
## nodematch.sex -0.26291086 -0.395968593 -0.31580547 -0.15622088
## nodematch.race.B -0.21721766 -0.281824055 -0.14643129 -0.09141757
## nodematch.race.O -0.04776795 -0.062744523 0.02978293 0.08490566
## nodematch.race.W -0.22818706 -0.290503598 -0.32963914 -0.12094633
## degree0 1.00000000 0.103685550 -0.03841935 -0.19571509
## degree1 0.10368555 1.000000000 -0.07824897 -0.16180126
## degree2 -0.03841935 -0.078248972 1.00000000 -0.04703310
## degree3 -0.19571509 -0.161801257 -0.04703310 1.00000000
##
## Sample statistics auto-correlation:
## Chain 1
## edges gwesp.fixed.0.1 absdiff.grade nodefactor.race.H
## Lag 0 1.0000000 1.0000000 1.0000000 1.0000000
## Lag 1024 0.9242438 0.9139287 0.8518386 0.8974191
## Lag 2048 0.8608504 0.8456854 0.7312233 0.8178966
## Lag 3072 0.8137024 0.7982228 0.6494202 0.7480497
## Lag 4096 0.7718377 0.7516315 0.5920513 0.6906253
## Lag 5120 0.7380983 0.7168389 0.5400459 0.6356014
## nodefactor.race.O nodefactor.race.W nodefactor.grade.8
## Lag 0 1.0000000 1.0000000 1.0000000
## Lag 1024 0.8705889 0.9251571 0.9036154
## Lag 2048 0.7713145 0.8576732 0.8285579
## Lag 3072 0.6897132 0.8015374 0.7721872
## Lag 4096 0.6319803 0.7471249 0.7169249
## Lag 5120 0.5849210 0.7009479 0.6687871
## nodefactor.grade.9 nodefactor.grade.10 nodefactor.grade.11
## Lag 0 1.0000000 1.0000000 1.0000000
## Lag 1024 0.9003404 0.9462080 0.9563349
## Lag 2048 0.8177591 0.9004086 0.9197094
## Lag 3072 0.7575317 0.8645901 0.8847389
## Lag 4096 0.7056203 0.8363806 0.8517691
## Lag 5120 0.6551942 0.8111863 0.8242450
## nodefactor.grade.12 nodefactor.sex.2 nodematch.grade.7
## Lag 0 1.0000000 1.0000000 1.0000000
## Lag 1024 0.9423195 0.9365560 0.9256685
## Lag 2048 0.8944708 0.8834308 0.8769837
## Lag 3072 0.8587071 0.8376272 0.8323979
## Lag 4096 0.8240021 0.7970441 0.7923418
## Lag 5120 0.7978995 0.7649794 0.7628393
## nodematch.grade.8 nodematch.grade.9 nodematch.grade.10
## Lag 0 1.0000000 1.0000000 1.0000000
## Lag 1024 0.9219861 0.8940983 0.9607779
## Lag 2048 0.8512698 0.8141317 0.9271287
## Lag 3072 0.7895566 0.7430683 0.8946723
## Lag 4096 0.7342734 0.6795219 0.8675095
## Lag 5120 0.6825197 0.6282148 0.8446192
## nodematch.grade.11 nodematch.grade.12 nodematch.sex nodematch.race.B
## Lag 0 1.0000000 1.0000000 1.0000000 1.0000000
## Lag 1024 0.9426041 0.9530666 0.9097992 0.9488344
## Lag 2048 0.8998734 0.9135046 0.8344462 0.9054133
## Lag 3072 0.8617286 0.8832580 0.7775583 0.8666203
## Lag 4096 0.8265265 0.8515591 0.7314050 0.8305514
## Lag 5120 0.7904181 0.8295898 0.6869687 0.7971026
## nodematch.race.O nodematch.race.W degree0 degree1 degree2
## Lag 0 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000
## Lag 1024 0.8421620 0.9222060 0.5001191 0.5616592 0.3841844
## Lag 2048 0.7300774 0.8478610 0.3651428 0.4699470 0.2792663
## Lag 3072 0.6456496 0.7844441 0.3168853 0.4247313 0.2415317
## Lag 4096 0.5687765 0.7266842 0.2865979 0.3810733 0.2385126
## Lag 5120 0.5078653 0.6726247 0.2459384 0.3490170 0.2368517
## degree3
## Lag 0 1.00000000
## Lag 1024 0.31636913
## Lag 2048 0.20735040
## Lag 3072 0.16874312
## Lag 4096 0.08715043
## Lag 5120 0.10666600
##
## Sample statistics burn-in diagnostic (Geweke):
## Chain 1
##
## Fraction in 1st window = 0.1
## Fraction in 2nd window = 0.5
##
## edges gwesp.fixed.0.1 absdiff.grade nodefactor.race.H
## -1.1767 -1.0597 -2.5617 0.8500
## nodefactor.race.O nodefactor.race.W nodefactor.grade.8 nodefactor.grade.9
## -3.4585 0.3416 1.8735 -1.6980
## nodefactor.grade.10 nodefactor.grade.11 nodefactor.grade.12 nodefactor.sex.2
## 4.9591 -0.6569 -4.3958 1.7658
## nodematch.grade.7 nodematch.grade.8 nodematch.grade.9 nodematch.grade.10
## -1.3485 1.7056 -1.2024 5.7242
## nodematch.grade.11 nodematch.grade.12 nodematch.sex nodematch.race.B
## -0.0783 -3.0072 -0.2559 -1.9378
## nodematch.race.O nodematch.race.W degree0 degree1
## -1.0781 0.4992 2.5715 3.5649
## degree2 degree3
## 0.8702 -1.9651
##
## Individual P-values (lower = worse):
## edges gwesp.fixed.0.1 absdiff.grade nodefactor.race.H
## 2.393137e-01 2.892603e-01 1.041653e-02 3.953377e-01
## nodefactor.race.O nodefactor.race.W nodefactor.grade.8 nodefactor.grade.9
## 5.431581e-04 7.326577e-01 6.100420e-02 8.950899e-02
## nodefactor.grade.10 nodefactor.grade.11 nodefactor.grade.12 nodefactor.sex.2
## 7.080492e-07 5.112441e-01 1.103830e-05 7.742978e-02
## nodematch.grade.7 nodematch.grade.8 nodematch.grade.9 nodematch.grade.10
## 1.775054e-01 8.807514e-02 2.292277e-01 1.039401e-08
## nodematch.grade.11 nodematch.grade.12 nodematch.sex nodematch.race.B
## 9.375889e-01 2.636692e-03 7.980435e-01 5.264917e-02
## nodematch.race.O nodematch.race.W degree0 degree1
## 2.809779e-01 6.176603e-01 1.012668e-02 3.639784e-04
## degree2 degree3
## 3.842013e-01 4.940837e-02
## Joint P-value (lower = worse): 0 .
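The individual p-values in the Geweke table are two-sided normal tail probabilities of the z-scores printed above them. The short base-R sketch below reproduces a few of them from the rounded z-scores; the names `geweke_z` and `geweke_p` are introduced here purely for illustration.

```r
### Geweke z-scores copied (rounded) from the diagnostic output above
geweke_z <- c(
  edges              = -1.1767,
  absdiff.grade      = -2.5617,
  nodematch.grade.10 =  5.7242,
  degree1            =  3.5649
)

### Two-sided p-value under a standard normal reference distribution
geweke_p <- 2 * pnorm(-abs(geweke_z))

print(signif(geweke_p, 3))
```

Very small values, as for nodematch.grade.10 and degree1, indicate that the early and late windows of the chain have different means, which suggests the sampler may not have reached its stationary distribution for those statistics.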
##
## MCMC diagnostics shown here are from the last round of simulation, prior to computation of final parameter estimates. Because the final estimates are refinements of those used for this simulation run, these diagnostics may understate model performance. To directly assess the performance of the final model on in-model statistics, please use the GOF command: gof(ergmFitObject, GOF=~model).
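The note above recommends checking the final fit with gof(). A minimal sketch of that call follows. To keep it quick and self-contained it uses the small flomarriage network bundled with ergm and an edges-only model; both are stand-ins chosen for this illustration, and in the workshop the fitted object would be dixon_model_04_ergm.

```r
library(ergm)

### Small example network shipped with ergm (a stand-in for dixon_network)
data(florentine)

### Edges-only model: dyad-independent, so it fits without MCMC estimation
flo_fit <- ergm(flomarriage ~ edges)

### Goodness of fit on the in-model statistics, as the note suggests
flo_gof <- gof(flo_fit, GOF = ~model)

print(flo_gof)
```

Calling plot(flo_gof) should show the observed statistics against the range of the simulated ones, which is often easier to read than the printed table.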
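### Draw one network from the fitted ERGM and convert it to an igraph object
run_04_ergm <- function() simulate(dixon_model_04_ergm) %>% intergraph::asIgraph()
### Run the workshop's assessment helper with 1,000 iterations of the simulator
dixon_04_lst <- run_network_model_assessment(dixon_igraph, run_04_ergm, n_iter = 1000)
plot(dixon_04_lst$assess_plot)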
summary(dixon_model_04_ergm)
##
## ==========================
## Summary of model fit
## ==========================
##
## Formula: dixon_network ~ edges + gwesp(0.1, fixed = TRUE) + absdiff("grade") +
## nodefactor("race") + nodefactor("grade") + nodefactor("sex") +
## nodematch("grade", diff = TRUE) + nodematch("sex", diff = FALSE) +
## nodematch("race", diff = TRUE) + degree(0:3)
##
## Iterations: 20 out of 20
##
## Monte Carlo MLE Results:
## Estimate Std. Error MCMC % z value Pr(>|z|)
## edges -6.33731 0.39412 0 -16.080 < 1e-04 ***
## gwesp.fixed.0.1 0.91132 0.07869 0 11.581 < 1e-04 ***
## absdiff.grade -0.40366 0.06266 0 -6.442 < 1e-04 ***
## nodefactor.race.H 1.32241 0.15615 0 8.469 < 1e-04 ***
## nodefactor.race.O 1.45324 0.15437 1 9.414 < 1e-04 ***
## nodefactor.race.W 0.29450 0.16208 1 1.817 0.069218 .
## nodefactor.grade.8 0.03297 0.18856 0 0.175 0.861184
## nodefactor.grade.9 0.68988 0.18356 0 3.758 0.000171 ***
## nodefactor.grade.10 -0.01004 0.17158 0 -0.059 0.953326
## nodefactor.grade.11 0.36446 0.17497 0 2.083 0.037247 *
## nodefactor.grade.12 0.42708 0.18301 0 2.334 0.019613 *
## nodefactor.sex.2 0.04557 0.03941 1 1.156 0.247611
## nodematch.grade.7 1.91276 0.42246 0 4.528 < 1e-04 ***
## nodematch.grade.8 1.55322 0.22881 0 6.788 < 1e-04 ***
## nodematch.grade.9 0.13926 0.20073 0 0.694 0.487841
## nodematch.grade.10 1.57035 0.20147 1 7.795 < 1e-04 ***
## nodematch.grade.11 0.79214 0.33627 0 2.356 0.018492 *
## nodematch.grade.12 0.89454 0.32536 0 2.749 0.005971 **
## nodematch.sex 0.28495 0.07284 0 3.912 < 1e-04 ***
## nodematch.race.B 2.22791 0.19794 1 11.255 < 1e-04 ***
## nodematch.race.H -Inf 0.00000 0 -Inf < 1e-04 ***
## nodematch.race.O -1.09269 0.76852 0 -1.422 0.155078
## nodematch.race.W 1.43677 0.20005 0 7.182 < 1e-04 ***
## degree0 2.56651 0.56109 0 4.574 < 1e-04 ***
## degree1 2.38440 0.40060 0 5.952 < 1e-04 ***
## degree2 1.29152 0.38197 0 3.381 0.000722 ***
## degree3 0.57815 0.37767 0 1.531 0.125811
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Null Deviance: 42459 on 30628 degrees of freedom
## Residual Deviance: NaN on 30601 degrees of freedom
##
## AIC: NaN BIC: NaN (Smaller is better.)
##
## Warning: The following terms have infinite coefficient estimates:
## nodematch.race.H
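An estimate of -Inf typically arises when the corresponding statistic is zero in the observed network, which here would mean that no tie joins two students who both have race H. We can check this by computing the observed statistics with summary() on a formula. The sketch below assumes that dixon_network was built from the faux.dixon.high dataset in ergm; that is an assumption and is not stated in this section, so substitute dixon_network if it is available in your session.

```r
library(ergm)

### Assumed source of the workshop's dixon_network
data(faux.dixon.high)

### Observed count of same-race ties, one statistic per race level
race_match_counts <- summary(faux.dixon.high ~ nodematch("race", diff = TRUE))

print(race_match_counts)

### Levels with a zero count cannot receive a finite coefficient estimate
print(names(race_match_counts)[race_match_counts == 0])
```

If a level does show a zero count, one option is to fit a single homophily term with diff = FALSE, as is already done for sex. The infinite estimate is also the likely reason the residual deviance, AIC and BIC are reported as NaN.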
sessioninfo::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
## setting value
## version R version 3.5.1 (2018-07-02)
## os Debian GNU/Linux 9 (stretch)
## system x86_64, linux-gnu
## ui X11
## language (EN)
## collate en_US.UTF-8
## ctype en_US.UTF-8
## tz Etc/UTC
## date 2020-03-27
##
## ─ Packages ───────────────────────────────────────────────────────────────────
## package * version date lib source
## assertthat 0.2.0 2017-04-11 [1] CRAN (R 3.5.1)
## backports 1.1.3 2018-12-14 [1] CRAN (R 3.5.1)
## bindr 0.1.1 2018-03-13 [1] CRAN (R 3.5.1)
## bindrcpp * 0.2.2 2018-03-29 [1] CRAN (R 3.5.1)
## broom 0.5.1 2018-12-05 [1] CRAN (R 3.5.1)
## cellranger 1.1.0 2016-07-27 [1] CRAN (R 3.5.1)
## cli 1.0.1 2018-09-25 [1] CRAN (R 3.5.1)
## coda 0.19-2 2018-10-08 [1] CRAN (R 3.5.1)
## colorspace 1.3-2 2016-12-14 [1] CRAN (R 3.5.1)
## conflicted * 1.0.1 2018-10-02 [1] CRAN (R 3.5.1)
## cowplot * 0.9.3 2018-07-15 [1] CRAN (R 3.5.1)
## crayon 1.3.4 2017-09-16 [1] CRAN (R 3.5.1)
## DEoptimR 1.0-8 2016-11-19 [1] CRAN (R 3.5.1)
## digest 0.6.18 2018-10-10 [1] CRAN (R 3.5.1)
## dplyr * 0.7.8 2018-11-10 [1] CRAN (R 3.5.1)
## ergm * 3.9.4 2018-08-16 [1] CRAN (R 3.5.1)
## evaluate 0.12 2018-10-09 [1] CRAN (R 3.5.1)
## fansi 0.4.0 2018-10-05 [1] CRAN (R 3.5.1)
## forcats * 0.3.0 2018-02-19 [1] CRAN (R 3.5.1)
## generics 0.0.2 2018-11-29 [1] CRAN (R 3.5.1)
## ggnetwork * 0.5.1 2016-03-25 [1] CRAN (R 3.5.1)
## ggplot2 * 3.1.0 2018-10-25 [1] CRAN (R 3.5.1)
## ggrepel * 0.8.0 2018-05-09 [1] CRAN (R 3.5.1)
## glue 1.3.0 2018-07-17 [1] CRAN (R 3.5.1)
## gtable 0.2.0 2016-02-26 [1] CRAN (R 3.5.1)
## haven 2.0.0 2018-11-22 [1] CRAN (R 3.5.1)
## hms 0.4.2 2018-03-10 [1] CRAN (R 3.5.1)
## htmltools 0.3.6 2017-04-28 [1] CRAN (R 3.5.1)
## httr 1.4.0 2018-12-11 [1] CRAN (R 3.5.1)
## igraph * 1.2.2 2018-07-27 [1] CRAN (R 3.5.1)
## igraphdata * 1.0.1 2015-07-13 [1] CRAN (R 3.5.1)
## intergraph * 2.0-2 2016-12-05 [1] CRAN (R 3.5.1)
## jsonlite 1.6 2018-12-07 [1] CRAN (R 3.5.1)
## knitr 1.21 2018-12-10 [1] CRAN (R 3.5.1)
## labeling 0.3 2014-08-23 [1] CRAN (R 3.5.1)
## lattice 0.20-35 2017-03-25 [2] CRAN (R 3.5.1)
## lazyeval 0.2.1 2017-10-29 [1] CRAN (R 3.5.1)
## lpSolve 5.6.13 2015-09-19 [1] CRAN (R 3.5.1)
## lubridate 1.7.4 2018-04-11 [1] CRAN (R 3.5.1)
## magrittr 1.5 2014-11-22 [1] CRAN (R 3.5.1)
## MASS 7.3-50 2018-04-30 [2] CRAN (R 3.5.1)
## Matrix 1.2-14 2018-04-13 [2] CRAN (R 3.5.1)
## memoise 1.1.0 2017-04-21 [1] CRAN (R 3.5.1)
## modelr 0.1.2 2018-05-11 [1] CRAN (R 3.5.1)
## munsell 0.5.0 2018-06-12 [1] CRAN (R 3.5.1)
## network * 1.13.0.1 2018-04-02 [1] CRAN (R 3.5.1)
## nlme 3.1-137 2018-04-07 [2] CRAN (R 3.5.1)
## pillar 1.3.1 2018-12-15 [1] CRAN (R 3.5.1)
## pkgconfig 2.0.2 2018-08-16 [1] CRAN (R 3.5.1)
## plyr 1.8.4 2016-06-08 [1] CRAN (R 3.5.1)
## purrr * 0.2.5 2018-05-29 [1] CRAN (R 3.5.1)
## R6 2.3.0 2018-10-04 [1] CRAN (R 3.5.1)
## Rcpp 1.0.0 2018-11-07 [1] CRAN (R 3.5.1)
## readr * 1.3.0 2018-12-11 [1] CRAN (R 3.5.1)
## readxl 1.2.0 2018-12-19 [1] CRAN (R 3.5.1)
## rlang 0.3.0.1 2018-10-25 [1] CRAN (R 3.5.1)
## rmarkdown 1.11 2018-12-08 [1] CRAN (R 3.5.1)
## robustbase 0.93-3 2018-09-21 [1] CRAN (R 3.5.1)
## rstudioapi 0.8 2018-10-02 [1] CRAN (R 3.5.1)
## rvest 0.3.2 2016-06-17 [1] CRAN (R 3.5.1)
## sand * 1.0.3 2017-03-02 [1] CRAN (R 3.5.1)
## scales * 1.0.0 2018-08-09 [1] CRAN (R 3.5.1)
## sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 3.5.1)
## sna * 2.4 2016-08-08 [1] CRAN (R 3.5.1)
## statnet.common * 4.1.4 2018-06-22 [1] CRAN (R 3.5.1)
## stringi 1.2.4 2018-07-20 [1] CRAN (R 3.5.1)
## stringr * 1.3.1 2018-05-10 [1] CRAN (R 3.5.1)
## tibble * 1.4.2 2018-01-22 [1] CRAN (R 3.5.1)
## tidyr * 0.8.2 2018-10-28 [1] CRAN (R 3.5.1)
## tidyselect 0.2.5 2018-10-11 [1] CRAN (R 3.5.1)
## tidyverse * 1.2.1 2017-11-14 [1] CRAN (R 3.5.1)
## trust 0.1-7 2015-07-04 [1] CRAN (R 3.5.1)
## utf8 1.1.4 2018-05-24 [1] CRAN (R 3.5.1)
## withr 2.1.2 2018-03-15 [1] CRAN (R 3.5.1)
## xfun 0.4 2018-10-23 [1] CRAN (R 3.5.1)
## xml2 1.2.0 2018-01-24 [1] CRAN (R 3.5.1)
## yaml 2.2.0 2018-07-25 [1] CRAN (R 3.5.1)
##
## [1] /usr/local/lib/R/site-library
## [2] /usr/local/lib/R/library